Tuesday, October 26, 2010

Loading a memory address into a register (PowerPC)

Last week I was doing some programming in 64-bit PowerPC assembly (on an IBM 8204-E8A with AIX 5.3.0.0). PowerPC is a load/store architecture, so rather than operating on memory directly, you first have to load your data into registers, operate on it, and then store it back into memory. I had quite a bit of trouble loading a memory address into a register, which is, as you can imagine, quite an obstacle to doing anything useful at all. ;-)
What I wanted was a static pointer, so I needed to store it somewhere its value would be preserved between calls to the subroutine. There is not a lot of documentation for the PowerPC architecture on the Web, but most of the resources I found [1, 2] cited the following method for loading a 64-bit address into a register:

.csect data[rw]
var: .quad 1 
...

.csect text[pr]

...

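lis    7, var@highest     # most significant 16 bits of the address
ori    7, 7, var@higher   # next 16 bits
rldicr 7, 7, 32, 31       # move both into the upper 32 bits
oris   7, 7, var@h        # next 16 bits
ori    7, 7, var@l        # least significant 16 bits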
The AIX assembler simply didn't like those suffixes (@highest, @higher, @h) at all. It complained that the @ character has no meaning in that context (error 1252-044). I didn't want to give up, so I compiled a small C program that reproduced just the part I needed, to see how the compiler does it. After reading the generated assembly, researching the IBM manuals and experimenting a bit, I came up with this simple solution using the TOC:
##  DATA section
.csect data[rw]
##  Table of contents
    .toc
T.var:  .tc     var{TC}, static_str

##  BSS section
.lcomm bss_static_str, 8, static_str

##  TEXT section
.csect text[pr]
.align 3

...
ld 11, T.var(2)  # load the addr of var into r11
To learn more about the complex subject of instruction formats and addressing, see [3]. I'll just attempt to explain this particular code segment without going into too much detail.
First of all, the table of contents (TOC) makes it possible to write code that works regardless of where in memory the program is loaded at runtime ("position-independent code" [3]). The .toc pseudo-op defines the table of contents for a module [4]. I took the convention of prefixing TOC symbols with a capital T from the C compiler's assembly output.
The .tc pseudo-op instructs the assembler to create an entry in the TOC, which refers to the static_str symbol. The static_str symbol is defined elsewhere, in the BSS section, using the .lcomm pseudo-op, which defines an uninitialized block of storage [4]. The BSS segment contains uninitialized data for the program; defining this storage elsewhere (meaning in another segment with a different pseudo-op) will likely result in a segmentation fault when running the program.
Finally, and this is where I wanted to get, to load the address of that memory location into a register you just need the plain and simple ld (load doubleword) instruction, with the target register as the first operand (in PPC assembly the target register is generally the first operand) and the TOC reference to the variable as the second. Notice that I provided the number 2 enclosed in parentheses, as shown below:
ld 11, T.var(2)
This is the TOC register, which by convention is register 2. It contains the address of the TOC, to which the offset of the specified entry is added. I generally set a symbol to reference it in a more readable way (as does the C compiler), as follows:
.set RTOC, 2
...
ld 11, T.var(RTOC)

The trick, then, is to reference a TOC symbol as an offset from the TOC address (register 2). To do this, write the name of the TOC symbol followed by the TOC register in parentheses (as shown before). Note that this idiom loads the address of the memory location rather than its contents!
Not that hard at all! ;-)

References:
[1] Assembly language for Power Architecture, Part 1 by Jonathan Bartlett (IBM)
[2] PowerPC assembly, by Hollis Blanchard (IBM)
[3] Assembly language for Power Architecture, Part 2 by Jonathan Bartlett (IBM)
[4] AIX 5L Version 5.3 Assembler Language Reference by IBM
