C/C++ Compilers/Shared Library

From Software Engineers Wiki

How do you build a shared library, and how is a program linked against it?

Answer

Let's say we have a function to be built into a shared library.

my_string.h

#ifndef MY_STRING_H
#define MY_STRING_H

#ifdef __cplusplus
extern "C" {
#endif

char *str_reverse(char *str);

#ifdef __cplusplus
}
#endif

#endif

my_string.c

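#include <stddef.h>

#include "my_string.h"

/* Reverse a NUL-terminated string in place and return it. */
char *str_reverse(char *str)
{
        char *head = str, *tail;
        char tmp;

        /* NULL or empty string: nothing to reverse. This also keeps
         * tail from being stepped to before the start of the buffer. */
        if (str == NULL || *str == '\0')
                return str;

        /* Advance tail to the terminating NUL, then back to the last character. */
        for (tail = str; *tail != '\0'; ++tail)
                ;
        --tail;

        /* Swap characters from both ends until the pointers meet. */
        while (head < tail) {
                tmp = *head;
                *head++ = *tail;
                *tail-- = tmp;
        }

        return str;
}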

We compile the source file as position-independent code (-fPIC, which shared libraries generally need), then link the object file into a shared library.

gcc -c -O2 -fPIC my_string.c
gcc -shared -o my_string.so my_string.o

main.c

#include <stdio.h>

#include "my_string.h"

int main(void)
{
        char str[] = "Hello, world!";

        printf("%s => ", str);
        str_reverse(str);
        printf("%s\n", str);

        return 0;
}

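Then compile main.c and link it against the shared library to produce an executable. To run it, the dynamic loader must be able to find my_string.so, for example through LD_LIBRARY_PATH.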

gcc -c -O2 main.c
gcc -o main main.o my_string.so

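In the main binary (x86-64), the dynamic section lists my_string.so as NEEDED, and the relocation table has a JUMP_SLOT entry for str_reverse: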

# readelf --dynamic main

Dynamic section at offset 0xe40 contains 21 entries:
  Tag        Type                         Name/Value
 0x0000000000000001 (NEEDED)             Shared library: [my_string.so]
 0x0000000000000001 (NEEDED)             Shared library: [libc.so.6]
 0x000000000000000c (INIT)               0x4005b8
 0x000000000000000d (FINI)               0x400868
 0x000000006ffffef5 (GNU_HASH)           0x400298
 0x0000000000000005 (STRTAB)             0x400408
 0x0000000000000006 (SYMTAB)             0x4002d0
 0x000000000000000a (STRSZ)              194 (bytes)
 0x000000000000000b (SYMENT)             24 (bytes)
 0x0000000000000015 (DEBUG)              0x0
 0x0000000000000003 (PLTGOT)             0x600fe8
 0x0000000000000002 (PLTRELSZ)           120 (bytes)
 0x0000000000000014 (PLTREL)             RELA
 0x0000000000000017 (JMPREL)             0x400540
 0x0000000000000007 (RELA)               0x400528
 0x0000000000000008 (RELASZ)             24 (bytes)
 0x0000000000000009 (RELAENT)            24 (bytes)
 0x000000006ffffffe (VERNEED)            0x4004e8
 0x000000006fffffff (VERNEEDNUM)         1
 0x000000006ffffff0 (VERSYM)             0x4004ca
 0x0000000000000000 (NULL)               0x0

# readelf --relocs main

Relocation section '.rela.dyn' at offset 0x528 contains 1 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000600fe0  000500000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0

Relocation section '.rela.plt' at offset 0x540 contains 5 entries:
  Offset          Info           Type           Sym. Value    Sym. Name + Addend
000000601000  000100000007 R_X86_64_JUMP_SLO 0000000000000000 puts + 0
000000601008  000200000007 R_X86_64_JUMP_SLO 0000000000000000 __stack_chk_fail + 0
000000601010  000300000007 R_X86_64_JUMP_SLO 0000000000000000 str_reverse + 0
000000601018  000400000007 R_X86_64_JUMP_SLO 0000000000000000 __libc_start_main + 0
000000601020  000600000007 R_X86_64_JUMP_SLO 0000000000000000 __printf_chk + 0

The same test on the ARM architecture:

# readelf --dynamic main

Dynamic section at offset 0x136cc contains 18 entries:
  Tag        Type                         Name/Value
 0x00000001 (NEEDED)                     Shared library: [my_string.so]
 0x0000000c (INIT)                       0x83c0
 0x0000000d (FINI)                       0x134a8
 0x00000019 (INIT_ARRAY)                 0x1b6bc
 0x0000001b (INIT_ARRAYSZ)               8 (bytes)
 0x0000001a (FINI_ARRAY)                 0x1b6c4
 0x0000001c (FINI_ARRAYSZ)               4 (bytes)
 0x00000004 (HASH)                       0x8014
 0x00000005 (STRTAB)                     0x827c
 0x00000006 (SYMTAB)                     0x80cc
 0x0000000a (STRSZ)                      283 (bytes)
 0x0000000b (SYMENT)                     16 (bytes)
 0x00000015 (DEBUG)                      0x0
 0x00000003 (PLTGOT)                     0x1b784
 0x00000002 (PLTRELSZ)                   40 (bytes)
 0x00000014 (PLTREL)                     REL
 0x00000017 (JMPREL)                     0x8398
 0x00000000 (NULL)                       0x0

# readelf --relocs main

Relocation section '.rel.plt' at offset 0x8398 contains 5 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
0001b790  00000216 R_ARM_JUMP_SLOT   00000000   malloc
0001b794  00000816 R_ARM_JUMP_SLOT   00000000   __deregister_frame_inf
0001b798  00001016 R_ARM_JUMP_SLOT   00008404   str_reverse
0001b79c  00001916 R_ARM_JUMP_SLOT   00000000   __register_frame_info
0001b7a0  00001a16 R_ARM_JUMP_SLOT   00000000   free

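In main, the symbol table entry for str_reverse is undefined (*UND*), and its value 0x8404 is the address of its PLT entry: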

00008404       F *UND*  00000000 str_reverse

Disassembly of section .rel.plt:

00008398 <.rel.plt>:
    8398:       0001b790        muleq   r1, r0, r7
    839c:       00000216        andeq   r0, r0, r6, lsl r2
    83a0:       0001b794        muleq   r1, r4, r7
    83a4:       00000816        andeq   r0, r0, r6, lsl r8
    83a8:       0001b798        muleq   r1, r8, r7
    83ac:       00001016        andeq   r1, r0, r6, lsl r0
    83b0:       0001b79c        muleq   r1, ip, r7
    83b4:       00001916        andeq   r1, r0, r6, lsl r9
    83b8:       0001b7a0        andeq   fp, r1, r0, lsr #15
    83bc:       00001a16        andeq   r1, r0, r6, lsl sl

Disassembly of section .plt:

000083d8 <.plt>:
    83d8:       e52de004        push    {lr}            ; (str lr, [sp, #-4]!)
    83dc:       e59fe004        ldr     lr, [pc, #4]    ; 83e8 <_init+0x28>
    83e0:       e08fe00e        add     lr, pc, lr
    83e4:       e5bef008        ldr     pc, [lr, #8]!
    83e8:       0001339c        muleq   r1, ip, r3
    83ec:       e28fc600        add     ip, pc, #0
    83f0:       e28cca13        add     ip, ip, #77824  ; 0x13000
    83f4:       e5bcf39c        ldr     pc, [ip, #924]! ; 0x39c
    83f8:       e28fc600        add     ip, pc, #0
    83fc:       e28cca13        add     ip, ip, #77824  ; 0x13000
    8400:       e5bcf394        ldr     pc, [ip, #916]! ; 0x394
    8404:       e28fc600        add     ip, pc, #0
    8408:       e28cca13        add     ip, ip, #77824  ; 0x13000
    840c:       e5bcf38c        ldr     pc, [ip, #908]! ; 0x38c
    8410:       e28fc600        add     ip, pc, #0
    8414:       e28cca13        add     ip, ip, #77824  ; 0x13000
    8418:       e5bcf384        ldr     pc, [ip, #900]! ; 0x384
    841c:       e28fc600        add     ip, pc, #0
    8420:       e28cca13        add     ip, ip, #77824  ; 0x13000
    8424:       e5bcf37c        ldr     pc, [ip, #892]! ; 0x37c

Disassembly of section .got:

0001b784 <_GLOBAL_OFFSET_TABLE_>:
   1b784:       0001b6cc        andeq   fp, r1, ip, asr #13
        ...
   1b790:       000083d8        ldrdeq  r8, [r0], -r8   ; <UNPREDICTABLE>
   1b794:       000083d8        ldrdeq  r8, [r0], -r8   ; <UNPREDICTABLE>
   1b798:       000083d8        ldrdeq  r8, [r0], -r8   ; <UNPREDICTABLE>
   1b79c:       000083d8        ldrdeq  r8, [r0], -r8   ; <UNPREDICTABLE>
   1b7a0:       000083d8        ldrdeq  r8, [r0], -r8   ; <UNPREDICTABLE>

When the main function calls str_reverse():

    84a8:       e1a0000d        mov     r0, sp
    84ac:       ebffffd4        bl      8404 <_init+0x44>

The call branches to address 0x8404 in the PLT (Procedure Linkage Table) section. The three instructions at 0x8404 compute an address and load PC from it. On ARM, reading PC yields the address of the current instruction plus 8, so the first instruction sets IP to 0x840c. The two offsets add up to 0x13000 + 0x38c = 0x1338c, so PC is loaded from 0x840c + 0x1338c = 0x1b798, which is the str_reverse slot in the GOT (Global Offset Table) section. The dynamic linker updates that slot during lazy binding.

In summary, the GOT holds the actual address of each function once lazy binding has resolved it, and the PLT holds the code that loads that address and jumps to it. Whenever the program calls a function in the shared library, it calls that function's entry in the PLT.

Initially, every function slot in the GOT holds the address of the beginning of the PLT section (0x83d8), which contains the following code.

    83d8:       e52de004        push    {lr}            ; (str lr, [sp, #-4]!)
    83dc:       e59fe004        ldr     lr, [pc, #4]    ; 83e8 <_init+0x28>
    83e0:       e08fe00e        add     lr, pc, lr
    83e4:       e5bef008        ldr     pc, [lr, #8]!
    83e8:       0001339c        muleq   r1, ip, r3

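It saves LR, then loads LR from 0x83e8, which holds 0x0001339c. It adds PC to LR. PC reads as 0x83e8 at that instruction (0x83e0 + 8), so LR now points to 0x1b784 (0x1339c + 0x83e8), the start of the GOT. Finally it loads PC from LR + 0x08 = 0x1b78c, the third GOT entry. That entry is zero in the file on disk, but the dynamic linker fills it in at load time with the address of its resolver routine, so the jump enters the resolver instead of causing an exception. The resolver looks up the symbol, writes the function's address into its GOT slot, and jumps to the function. This is lazy binding.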
