Linking, compiling, undefined behavior, it's all fun! Here are some learning by doing examples that you don't see every day.

daniel-falk/Fun-with-c

Fun with c

Some tutorials and experiments with C: defined and undefined behaviour, libraries and cross-language linking.

Calling C++ from C

Calling a C library from C++ is as simple as adding the object file as a compiler argument and including the header file wrapped in an extern "C" {} block (more about this later). However, calling a C++ library from C demands a bit more thought.
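To make the C-from-C++ direction concrete, here is a minimal single-file sketch. The names clib.h, c_square and square_plus_one are hypothetical and only serve as illustration. In a real project the declaration would come from the C header and the definition from an object file compiled with gcc; here both are stand-ins so that the snippet compiles on its own.

```cpp
// Stand-in for:  extern "C" { #include "clib.h" }
// (clib.h is a hypothetical plain C header)
extern "C" {
    int c_square(int x);  // declared with C linkage, so no name mangling
}

// Stand-in for the C object file that would normally provide c_square.
extern "C" int c_square(int x) {
    return x * x;
}

// Ordinary C++ code calls the C function like any other function.
int square_plus_one(int x) {
    return c_square(x) + 1;
}
```

Calling square_plus_one(3) goes through the unmangled symbol c_square, which is the name a C compiler would have emitted for it.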

Let's create a simple C++ library:

// File: lib.cpp

#include <iostream>

void print_hey() {
    std::cout << "hello from c++! =)" << std::endl;
}

int add_ab(int a, int b) {
    return a + b;
}

Let's compile it to an object file:

$> g++ lib.cpp -c -o lib.o

We now have an object file, lib.o. Let's take a closer look at the symbols in the text section of the ELF file using objdump:

$> objdump -t lib.o | grep \.text 
0000000000000000 l    d  .text	0000000000000000 .text
0000000000000036 l     F .text	000000000000003d _Z41__static_initialization_and_destruction_0ii
0000000000000073 l     F .text	0000000000000015 _GLOBAL__sub_I__Z9print_heyv
0000000000000000 g     F .text	0000000000000022 _Z9print_heyv
0000000000000022 g     F .text	0000000000000014 _Z6add_abii

The symbol table shows our two library functions: print_hey at offset 0x0 (size 0x22) and add_ab at offset 0x22 (size 0x14). However, the symbol names are not as we wrote them. This is name mangling, which C++ uses to allow function overloading, something C does not support. Each name is prefixed with _Z and the length of the original name (9 for print_hey, 6 for add_ab) and suffixed with an encoding of the parameter types. The v stands for void, since print_hey takes no arguments. The arguments to add_ab are int and int, and that is exactly where ii comes from.
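If you want to check how a mangled name decodes, the following sketch turns it back into a readable signature. It assumes GCC or Clang, whose runtime provides abi::__cxa_demangle in <cxxabi.h>; the helper name demangle is made up for this example.

```cpp
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang runtime)
#include <cstdlib>   // std::free
#include <string>

// Turn a mangled symbol name back into a readable C++ signature.
// Returns the input unchanged if it cannot be demangled.
std::string demangle(const char* mangled) {
    int status = 0;
    // With a null output buffer the function allocates one with malloc.
    char* readable = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || readable == nullptr) {
        return mangled;
    }
    std::string result(readable);
    std::free(readable);  // the caller owns the returned buffer
    return result;
}
```

With the symbols from the listing above, demangle("_Z9print_heyv") gives print_hey() and demangle("_Z6add_abii") gives add_ab(int, int). The command line tool c++filt does the same job.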

Let's now try to write a C program that uses this library.

// File: main.c

#include <stdio.h>

extern void print_hey();
extern int  add_ab(int, int);

int main() {
    print_hey();

    printf("1 + 2 = %d\n", add_ab(1, 2));
}

If we try to compile this main file together with the library we get unresolved symbols from the linker:

$> gcc main.c lib.o       
/tmp/ccVAwZNC.o: In function `main':
main.c:(.text+0xa): undefined reference to `print_hey'
main.c:(.text+0x19): undefined reference to `add_ab'
lib.o: In function `print_hey()':
lib.cpp:(.text+0xa): undefined reference to `std::cout'
lib.cpp:(.text+0xf): undefined reference to `std::basic_ostream<char, std::char_traits<char> >& std::operator<< <std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&, char const*)'
lib.cpp:(.text+0x14): undefined reference to `std::basic_ostream<char, std::char_traits<char> >& std::endl<char, std::char_traits<char> >(std::basic_ostream<char, std::char_traits<char> >&)'
lib.cpp:(.text+0x1c): undefined reference to `std::ostream::operator<<(std::ostream& (*)(std::ostream&))'
lib.o: In function `__static_initialization_and_destruction_0(int, int)':
lib.cpp:(.text+0x59): undefined reference to `std::ios_base::Init::Init()'
lib.cpp:(.text+0x68): undefined reference to `std::ios_base::Init::~Init()'
collect2: error: ld returned 1 exit status

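However, if we use g++ as the compiler it succeeds with no errors. This is because g++ treats main.c as C++ source, so the declarations are mangled the same way as in the library, and it links libstdc++ automatically.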

$> g++ main.c lib.o

The errors above might look terrifying, but let's start from the top. As I said earlier, C++ uses name mangling, which C does not. Let's try to use the function names that we found in the symbol table of the library:

// File: main.c

#include <stdio.h>

extern void _Z9print_heyv();
extern int  _Z6add_abii(int, int);

int main() {
    _Z9print_heyv();

    printf("1 + 2 = %d\n", _Z6add_abii(1, 2));
}

If we compile this, we find that the first two errors are gone. Yay! (Do I need to tell you that this is not the right way to do it?) The remaining errors all seem to be related to C++. This is because the C++ library depends on libstdc++, which gcc does not link by default. Let's link it in!

$> gcc main.c lib.o -lstdc++

$> ./a.out 
hello from c++! =)
1 + 2 = 3

Hurray, it works! If we use ldd on the output binary, we can see that it now links to the system's libstdc++ library:

$> ldd a.out 
	linux-vdso.so.1 (0x00007ffcf8ebe000)
	libstdc++.so.6 => /usr/lib/x86_64-linux-gnu/libstdc++.so.6 (0x00007f2bd8c91000)
	libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f2bd88e6000)
	libm.so.6 => /lib/x86_64-linux-gnu/libm.so.6 (0x00007f2bd85e5000)
	/lib64/ld-linux-x86-64.so.2 (0x00007f2bd8f9c000)
	libgcc_s.so.1 => /lib/x86_64-linux-gnu/libgcc_s.so.1 (0x00007f2bd83cf000)

Let's figure out a better way to deal with the name mangling. C++ has a linkage specification, extern "C", which tells the compiler to give a function C linkage, meaning its name is not mangled. Let's try it out!

// File: lib.cpp

#include <iostream>

extern "C"
void print_hey() {
    std::cout << "hello from c++! =)" << std::endl;
}

extern "C"
int add_ab(int a, int b) {
    return a + b;
}

Let's compile it and look at the object file's text section again:

$> g++ lib.cpp -c -o lib.o

$> objdump -t lib.o | grep \.text
0000000000000000 l    d  .text	0000000000000000 .text
0000000000000036 l     F .text	000000000000003d _Z41__static_initialization_and_destruction_0ii
0000000000000073 l     F .text	0000000000000015 _GLOBAL__sub_I_print_hey
0000000000000000 g     F .text	0000000000000022 print_hey
0000000000000022 g     F .text	0000000000000014 add_ab

We can now see that our functions have symbols with exactly the names we expected. Using this method we can return to the first version of main.c (although still linking to libstdc++), and we should be able to compile and run the program.
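In practice the declarations usually live in a header that both the C program and the C++ library include. The sketch below shows one common way to write such a header, guarded by the predefined __cplusplus macro. The file name lib.h is an assumption and not part of the example files; both parts are shown in one file only so that the snippet compiles on its own.

```cpp
// ---- contents of a hypothetical lib.h, shared by lib.cpp and main.c ----
#ifdef __cplusplus
extern "C" {  // only a C++ compiler sees this line
#endif

void print_hey(void);
int  add_ab(int a, int b);

#ifdef __cplusplus
}  // closes the extern "C" block
#endif

// ---- contents of lib.cpp, which would normally #include "lib.h" ----
#include <iostream>

// The declarations above already gave these functions C linkage,
// so the definitions need no extra extern "C".
void print_hey(void) {
    std::cout << "hello from c++! =)" << std::endl;
}

int add_ab(int a, int b) {
    return a + b;
}
```

A C compiler skips the extern "C" lines and sees two plain prototypes, while a C++ compiler sees the same prototypes with C linkage, so both sides agree on the unmangled symbol names.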

With C linkage we cannot overload functions, but we can still have C++ overloads with the same name as the C-linkage function:

// File: lib.cpp

#include <iostream>

extern "C"
void print_hey() {
    std::cout << "hello from c++! =)" << std::endl;
}

void print_hey(int a) {
    std::cout << "hello int: " << a << std::endl;
}

void print_hey(double a) {
    std::cout << "hello double: " << a << std::endl;
}

extern "C"
int add_ab(int a, int b) {
    return a + b;
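}

Compiling this and listing the symbols shows one unmangled print_hey next to two mangled overloads:

$> g++ lib.cpp -c -o lib.o && objdump -t lib.o | grep \.text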
0000000000000000 l    d  .text	0000000000000000 .text
0000000000000070 l     F .text	000000000000003d _Z41__static_initialization_and_destruction_0ii
00000000000000ad l     F .text	0000000000000015 _GLOBAL__sub_I_print_hey
0000000000000000 g     F .text	0000000000000022 print_hey
0000000000000022 g     F .text	000000000000001c _Z9print_heyi
000000000000003e g     F .text	000000000000001e _Z9print_heyd
000000000000005c g     F .text	0000000000000014 add_ab

If we try to add the extern "C" linkage specification to another one of them, we get a compilation error:

// File: lib.cpp

#include <iostream>

extern "C"
void print_hey() {
    std::cout << "hello from c++! =)" << std::endl;
}

extern "C"
void print_hey(int a) {
    std::cout << "hello int: " << a << std::endl;
}

void print_hey(double a) {
    std::cout << "hello double: " << a << std::endl;
}

extern "C"
int add_ab(int a, int b) {
    return a + b;
}

$> g++ lib.cpp -c -o lib.o
lib.cpp: In function ‘void print_hey(int)’:
lib.cpp:11:21: error: conflicting declaration of C function ‘void print_hey(int)’
 void print_hey(int a) {
                     ^
lib.cpp:6:6: note: previous declaration ‘void print_hey()’
 void print_hey() {
      ^

This is expected: C++ name mangling is the only thing that keeps the overloads apart, so without it both functions would need the same symbol name, print_hey.

The same logic applies to both .a static library files and .so dynamically linked files. You can find some example files in the cpp-and-c subfolder.
