VDict - Definition of amulet

Computing (FOLDOC) dictionary

Amulet

<processor> An implementation of the Advanced RISC Machine
microprocessor architecture using the micropipeline design
style. In April 1994 the Amulet group in the Computer Science
department of Manchester University took delivery of the
AMULET1 microprocessor. This was their first large-scale
asynchronous circuit and the world's first implementation of a
commercial microprocessor architecture (ARM) in asynchronous
logic. Work began at the end of 1990 and the design was
despatched for fabrication in February 1993. The primary
intent was to demonstrate that an asynchronous microprocessor
can consume less power than a synchronous design.

The design incorporates a number of concurrent units which
cooperate to give instruction-level compatibility with the
existing synchronous part. These include an Address unit,
which autonomously generates instruction fetch requests and
interleaves (nondeterministically) data requests from the
Execution unit; a Register file which supplies operands,
queues write destinations and handles data dependencies; an
Execution unit which includes a multiplier, a shifter and an
ALU with data-dependent delay; a Data interface which
performs byte extraction and alignment and includes an
instruction prefetch buffer; and a control path which
performs instruction decode. These units only synchronise
to exchange data.
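
The idea of units that run concurrently and synchronise only to
exchange data can be loosely illustrated in software. The Python
sketch below is an analogy, not a model of the AMULET1 circuits:
two threads stand in for an Address unit and an Execution unit, and
bounded queues stand in for handshake channels. The names
`address_unit`, `execution_unit` and `run`, and the sleep-based
delay, are assumptions made purely for illustration.

```python
import queue
import threading
import time

DONE = object()  # sentinel marking the end of the request stream


def address_unit(fetch_ch, n):
    """Issue n fetch requests; put() blocks while the channel is full."""
    for pc in range(n):
        fetch_ch.put(pc)
    fetch_ch.put(DONE)


def execution_unit(fetch_ch, result_ch):
    """Handle each request with a data-dependent delay, then pass it on."""
    while True:
        pc = fetch_ch.get()
        if pc is DONE:
            result_ch.put(DONE)
            return
        time.sleep(0.001 * (pc % 3))  # stand-in for a data-dependent delay
        result_ch.put(pc * 2)


def run(n=6):
    """Run both units concurrently and collect the results in order."""
    fetch_ch = queue.Queue(maxsize=1)   # one-slot channel: near-rendezvous
    result_ch = queue.Queue(maxsize=1)
    units = [
        threading.Thread(target=address_unit, args=(fetch_ch, n)),
        threading.Thread(target=execution_unit, args=(fetch_ch, result_ch)),
    ]
    for unit in units:
        unit.start()
    results = []
    while True:
        item = result_ch.get()
        if item is DONE:
            break
        results.append(item)
    for unit in units:
        unit.join()
    return results


print(run())  # prints [0, 2, 4, 6, 8, 10]
```

Neither thread follows a shared clock; each proceeds at its own
pace and waits only when a channel is full or empty. The timing
varies from run to run, but the results arrive in the same order
every time, which echoes the entry's point that nondeterministic
timing need not change what is actually executed.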

The design demonstrates that all the usual problems of
processor design can be solved in this asynchronous framework:
backward instruction set compatibility, interrupts and
exact exceptions for memory faults are all covered. It
also demonstrates some unusual behaviour, for instance
nondeterministic prefetch depth beyond a branch instruction
(though the instructions which actually get executed are, of
course, deterministic). There are some unusual problems for
compiler optimisation, as the metric which must be used to
compare alternative code sequences is continuous rather than
discrete, and the nondeterminism in external behaviour must
also be taken into account.
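
To make the "continuous rather than discrete" point concrete, the
Python sketch below compares two code sequences under a clocked
cost model (whole cycles) and under an asynchronous one (summed
delays). The per-operation delays, the 50 ns clock period and the
operation names are hypothetical values chosen for illustration;
they are not AMULET1 measurements.

```python
import math

# Hypothetical per-operation delays in nanoseconds (illustrative only).
DELAY_NS = {"add": 30.0, "shift_add": 42.0}


def clocked_cost_cycles(seq, period_ns=50.0):
    """Discrete metric: each operation is rounded up to whole clock cycles."""
    return sum(math.ceil(DELAY_NS[op] / period_ns) for op in seq)


def async_cost_ns(seq):
    """Continuous metric: each operation takes exactly as long as it needs."""
    return sum(DELAY_NS[op] for op in seq)


seq_a = ["add", "add"]
seq_b = ["shift_add", "shift_add"]

print(clocked_cost_cycles(seq_a), clocked_cost_cycles(seq_b))  # 2 2
print(async_cost_ns(seq_a), async_cost_ns(seq_b))              # 60.0 84.0
```

Under the cycle-count metric the two sequences tie, whereas the
continuous metric separates them, so under these assumptions a
compiler could not rank them by counting cycles alone.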

The chip was designed using a mixture of custom datapath and
compiled control logic elements, as was the synchronous ARM.
The fabrication technology is the same as that used for one
version of the synchronous part, reducing the number of
variables when comparing the two parts.

Two silicon implementations have been received and preliminary
measurements have been taken from these. The first is on a
0.7 micron process and has achieved about 28 kDhrystones running
the standard benchmark program. The other is a 1 micron
implementation and achieves about 20 kDhrystones. For the
faster of the parts this is equivalent to a synchronous ARM6
clocked at around 20 MHz; in the case of AMULET1 it is likely
that this speed is limited by the memory system cycle time
(just over 50 ns) rather than the processor chip itself.
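
The link between a memory cycle time of about 50 ns and a 20 MHz
clock is a unit conversion, sketched below in Python. The function
name `max_rate_mhz` is an assumption introduced here; the only
input taken from the entry is the 50 ns figure.

```python
def max_rate_mhz(cycle_time_ns):
    """Return the highest access rate, in MHz, for a given cycle time in ns."""
    return 1000.0 / cycle_time_ns


print(max_rate_mhz(50.0))  # prints 20.0
```

A memory system that needs just over 50 ns per cycle can therefore
supply slightly fewer than 20 million accesses per second, which is
consistent with the measured speed matching an ARM6 at around
20 MHz.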

A fair comparison of devices at the same geometries gives the
AMULET1 performance as about 70% of that of an ARM6 running
at 20 MHz. Its power consumption is very similar to that of
the ARM6; the AMULET1 therefore delivers about 80 MIPS/W
(compared with around 120 from a 20 MHz ARM6). Multiplication
is several times faster on the AMULET1 owing to the inclusion
of a specialised asynchronous multiplier. This performance is
reasonable considering that the AMULET1 is a first-generation
part, whereas the synchronous ARM has undergone several design
iterations. AMULET2 (currently under development) is expected
to be three times faster than AMULET1 (120 kDhrystones)
and use less power.
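
The "about 70%" figure can be cross-checked against the other
numbers in the entry. The Python sketch below assumes that an ARM6
at 20 MHz scores about 28 kDhrystones (inferred from the statement
that the faster AMULET1 part is equivalent to such an ARM6) and
that the two chips draw similar power; the variable names are
introduced here for illustration.

```python
amulet1_1um_kdhry = 20.0     # 1 micron AMULET1, from the entry
arm6_20mhz_kdhry = 28.0      # assumption: ARM6 at 20 MHz, inferred from the entry

amulet1_mips_per_watt = 80.0   # from the entry
arm6_mips_per_watt = 120.0     # from the entry

# Relative performance at the same geometry.
relative_perf = amulet1_1um_kdhry / arm6_20mhz_kdhry

# With similar power consumption, the efficiency ratio should be close
# to the performance ratio.
efficiency_ratio = amulet1_mips_per_watt / arm6_mips_per_watt

print(round(relative_perf, 2), round(efficiency_ratio, 2))  # 0.71 0.67
```

Both ratios land near 0.7, so under these assumptions the
Dhrystone and MIPS/W figures agree with each other.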

The macrocell size (without pad ring) is 5.5 mm by 4.5 mm
on a 1 micron CMOS process, which is about twice the area of
the synchronous part. Some of the increase can be attributed
to the more sophisticated organisation of the new part: it has
a deeper pipeline than the clocked version and it supports
multiple outstanding memory requests; there is also
specialised circuitry to increase the multiplication speed.
Although there is undoubtedly some overhead attributable to
the asynchronous control logic, this is estimated to be closer
to 20% than to the 100% suggested by the direct comparison.
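
The area figures can be worked through as follows. This Python
sketch assumes that "about twice" means exactly twice and that the
20% overhead is measured relative to the synchronous part's area;
both are simplifying assumptions, and the variable names are
introduced here for illustration.

```python
amulet1_area_mm2 = 5.5 * 4.5           # macrocell without pad ring: 24.75 mm^2
sync_area_mm2 = amulet1_area_mm2 / 2   # assumption: exactly half of AMULET1

# Assumption: asynchronous control overhead is 20% of the synchronous area.
async_overhead_mm2 = 0.20 * sync_area_mm2

# The rest of the increase would then come from the deeper pipeline,
# multiple outstanding memory requests and the faster multiplier.
organisation_mm2 = amulet1_area_mm2 - sync_area_mm2 - async_overhead_mm2

print(amulet1_area_mm2)               # prints 24.75
print(round(sync_area_mm2, 2))        # prints 12.38
print(round(async_overhead_mm2, 2))   # prints 2.48
print(round(organisation_mm2, 2))     # prints 9.9
```

On these assumptions, roughly 2.5 mm^2 of the extra area would be
asynchronous control overhead and roughly 9.9 mm^2 would reflect
the more sophisticated organisation.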

AMULET1 is code compatible with ARM6 and so is capable of
running existing binaries without modification. The
implementation also includes features such as interrupts and
memory aborts.

The work was part of a broad ESPRIT-funded investigation
into low-power technologies within the European Open
Microprocessor systems Initiative (OMI) programme, where
there is interest in low-power techniques both for portable
equipment and (in the longer term) to alleviate the problems
of the increasingly high dissipation of high-performance
chips. This initial investigation into the role asynchronous
logic might play has now demonstrated that asynchronous
techniques can be applied to problems of the scale of a
complete microprocessor.

(1994-12-08)
