headers_and_libraries.md


Libraries

Today:

  • Different types of libraries
    • Header-only
    • Static
    • Dynamic
  • What is linking
  • When to use the keyword inline
  • Some common best practices

📺 Watch the related YouTube video!


Special symbols used in slides

  • 🎨 - Style recommendation
  • 🎓 - Software design recommendation
  • 😱 - Not a good practice! Avoid in real life!
  • ✅ - Good practice!
  • ❌ - Whatever is marked with this is wrong
  • 🚨 - Alert! Important information!
  • 💡 - Hint or a useful exercise
  • 🔼1️⃣7️⃣ - Holds for this version of C++ (here, 17) and above
  • 🔽1️⃣1️⃣ - Holds for C++ versions up to this one (here, 11)

Style (🎨) and software design (🎓) recommendations mostly come from the Google C++ Style Guide and the C++ Core Guidelines


Let's start with an example

Let's say we implement a new machine learning framework 😉

#include <vector>
#include <iostream>

[[nodiscard]] int
PredictNumber(const std::vector<int>& numbers) {
  // Arbitrarily complex code goes here.
  if (numbers.empty()) { return 0; }
  if (numbers.size() < 2) { return numbers.front(); }
  const auto& one_before_last = numbers[numbers.size() - 2UL];
  const auto difference = numbers.back() - one_before_last;
  return numbers.back() + difference;
}
// Many more similar functions.

int main() {
  const auto number = PredictNumber({1, 2});
  if (number != 3) {
    std::cerr << "Our function does not work as expected 😥\n";
    return 1;
  }
  return 0;
}

What if we want to use it in multiple places?

  • For now code lives in a single binary
  • Now assume that we have two programs we want to write:
    • One to predict the house pricing
    • One to predict the bitcoin price
  • These should use our "machine learning" functions
  • Plus other things specific to those use cases
  • 😱 Should we just copy the code over?

😱 Problems with copying?

  • Our code is duplicated
  • If we have more binaries, we have more copies
  • Any change to the functionality needs to be synced across all copies
  • It requires us to keep this in mind - which is error prone
  • (violates the DRY principle)



Better solution: header files!

ml.h

#pragma once  // Stay tuned 😉
#include <vector>
[[nodiscard]] inline  // Stay tuned for "inline"
int PredictNumber(const std::vector<int>& numbers) {
  // Compute next number (skipped to fit on the slide)
  return next_number;
}

predict_housing.cpp

#include <ml.h>
#include <iostream>
int main() {
  const auto prices =
    MagicallyGetHousePrices();
  std::cout
    << "Upcoming price: "
    << PredictNumber(prices);
  return 0;
}

predict_bitcoin.cpp

#include <ml.h>
#include <iostream>
int main() {
  const auto prices =
    MagicallyGetBitcoinPrices();
  std::cout
    << "Upcoming price: "
    << PredictNumber(prices);
  return 0;
}

Yay! A header-only library! 🎉

  • All functions are implemented in header files (.h, .hpp)
  • We #include these header files in our binaries
  • 💡 Put your own includes first, then headers of other libraries, then standard library headers

Pros:

  • Compiler sees all code so it can optimize it well
  • Compilation remains simple (just need the new -I flag)
    c++ -std=c++17 -I folder_with_headers binary.cpp

Cons:

  • We always recompile the code in headers
  • Changes in headers require recompilation of depending code
  • If we ship the code it remains readable to anyone
  • We should make the functions inline (stay tuned)
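The ml.h shown earlier skips the function body to fit on a slide. As a minimal sketch, a complete header-only version could look like the following, with the body copied from the first example. It uses an include guard, a portable alternative to #pragma once that is covered next. CheckPredictNumber is a hypothetical helper that is not part of the original slides.

```cpp
// ml.h (header-only sketch; body copied from the first example)
#ifndef ML_H_
#define ML_H_

#include <cassert>
#include <vector>

[[nodiscard]] inline int PredictNumber(const std::vector<int>& numbers) {
  if (numbers.empty()) { return 0; }
  if (numbers.size() < 2) { return numbers.front(); }
  // Extrapolate linearly from the last two values.
  const auto difference = numbers.back() - numbers[numbers.size() - 2UL];
  return numbers.back() + difference;
}

// Hypothetical helper (not in the slides): a sanity check that any binary
// including this header can call.
inline void CheckPredictNumber() {
  assert(PredictNumber({1, 2}) == 3);
  assert(PredictNumber({}) == 0);
}

#endif  // ML_H_
```

Any binary that includes this header compiles its own copy of PredictNumber, which is exactly the recompilation cost listed under the cons.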

What's #pragma once?

  • A preprocessor directive that ensures the header in which it is written is included only once per translation unit
  • It is not part of the C++ standard, but virtually all mainstream compilers support it
  • Alternative: include guards. For a file file.h in folder/ they can be:
    #ifndef FOLDER_FILE_H_
    #define FOLDER_FILE_H_
    
    #endif /* FOLDER_FILE_H_ */
  • They also ensure the header file will be included only once
  • ✅ Always use one of these in your header files!
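To see what the guard does, the sketch below pastes the contents of a header twice into one file, which is effectively what the preprocessor does when the same header is included twice. Point and kOrigin are hypothetical names used only for illustration.

```cpp
// ---- first #include of folder/file.h ----
#ifndef FOLDER_FILE_H_
#define FOLDER_FILE_H_
struct Point {  // hypothetical content of the header
  int x;
  int y;
};
#endif /* FOLDER_FILE_H_ */

// ---- second #include of the same header ----
#ifndef FOLDER_FILE_H_  // already defined: everything up to #endif is skipped
#define FOLDER_FILE_H_
struct Point {  // without the guard this would be a redefinition error
  int x;
  int y;
};
#endif /* FOLDER_FILE_H_ */

// Point was defined exactly once, so it can be used as usual.
constexpr Point kOrigin{0, 0};
static_assert(kOrigin.x == 0 && kOrigin.y == 0, "Point is defined exactly once");
```

Removing the guard lines makes the compiler reject the second struct Point as a redefinition.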

Avoid long compilation times by using binary libraries!

  • Move only declarations to header files: *.h or *.hpp
  • Move definitions to source files: *.cpp or *.cc
  • Compile corresponding source files to object files
  • Bundle the object files into libraries
  • Link the libraries to executables
  • The library is built once, and linked to multiple targets!
  • If we change the code in a library we only need to:
    • Rebuild only the library
    • [maybe] Relink this library to our executables

Declaration: ml.h

#pragma once
#include <vector>
[[nodiscard]] int
PredictNumber(const std::vector<int>& numbers);

Definition: ml.cpp

#include <ml.h>
#include <vector>
[[nodiscard]] int
PredictNumber(const std::vector<int>& numbers) {
  // Compute next number (skipped to fit on the slide)
  return next_number;
}

Calling it: predict_prices.cpp:

#include <ml.h>
#include <iostream>
int main() {
  const auto prices = MagicallyGetBitcoinPrices();
  std::cout << "Upcoming price: " << PredictNumber(prices);
  return 0;
}

Just build it as before?

c++ -std=c++17 predict_prices.cpp -I . -o predict_prices

Error: the compiler saw only the declaration, so the linker cannot find the definition

Undefined symbols for architecture arm64:
  "PredictNumber(
    std::__1::vector<int, std::__1::allocator<int> > const&)",
    referenced from: _main in predict_prices-066946.o
ld: symbol(s) not found for architecture arm64
clang: error: linker command failed with exit code 1
(use -v to see invocation)

💡 Your error will look similar but slightly different
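The compiler accepted the call to PredictNumber because a declaration is all it needs; matching the call to a definition is the linker's job. The single-file sketch below reproduces the same split under that reading: the caller sees only the declaration, and the definition appears later, standing in for ml.cpp. PredictUpcoming and its non-empty precondition are hypothetical and not part of the original slides.

```cpp
#include <cassert>
#include <vector>

// Declaration (what ml.h provides): enough for the compiler to type-check calls.
[[nodiscard]] int PredictNumber(const std::vector<int>& numbers);

// Hypothetical caller (the role of predict_prices.cpp): it compiles against
// the declaration alone.
[[nodiscard]] int PredictUpcoming(const std::vector<int>& prices) {
  assert(!prices.empty());  // assumed precondition for this sketch
  return PredictNumber(prices);
}

// Definition (what ml.cpp provides). If it is missing from the build, the
// compiler is still happy, but the linker reports an undefined symbol.
int PredictNumber(const std::vector<int>& numbers) {
  if (numbers.empty()) { return 0; }
  if (numbers.size() < 2) { return numbers.front(); }
  // Same linear extrapolation as before: last + (last - one_before_last).
  return 2 * numbers.back() - numbers[numbers.size() - 2UL];
}
```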

🤔 Compile all together - solution?

c++ -std=c++17 -I . ml.cpp predict_prices.cpp

❌ not really - does not solve our "recompilation" issue!


Compile objects and libraries

  • Compile source files into object files (use the -c flag)
    c++ -std=c++17 -c ml.cpp -I includes -o ml_static.o
    c++ -std=c++17 -c -fPIC ml.cpp -I includes -o ml_dynamic.o
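    This assumes that all headers live in the includes folder. The results are *.o object files: compiled machine code that the linker can combine into libraries and executables. The -fPIC flag produces position-independent code, which dynamic libraries need.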
  • Pack objects into libraries:
    • Static libraries (*.a) are just archives of object files
      ar rcs libml.a ml_static.o <other_object_files>
    • Dynamic libraries (*.so) are a bit more complex
      c++ -shared ml_dynamic.o <other_object_files> -o libml.so
  • Finally, we link the libraries to our binary

Linking libraries to binaries

  • Linking resolves each declaration the compiler saw in a header to its definition in an object file or binary library
  • Link our main executable to the libraries it uses
    c++ -std=c++17 main.cpp -L folder -I includes -lml -o main
    • -I includes - Headers are in the includes folder
    • -L folder - Add folder to the library search path
    • -lml - Link to the library file libml.a or libml.so
    • 🚨 Note that -l flags must be after all .cpp or .o files
    • 🚨 Same usage for both static and dynamic libraries but a different resulting executable
  • Static libraries are copied inside the resulting main binary
  • Dynamic libraries are linked to the resulting main binary

What's the difference between static and dynamic libraries?

  • Binaries with static linkage:
    • Contain binary code of other libraries, usually bigger
    • Can be copied anywhere on any similar operating system
  • Binaries with dynamic linkage:
    • Contain references to other libraries, usually smaller
    • Dependencies (dynamic libraries) are looked up at runtime, for example:
      • On some platforms, relative to the current path
      • In the paths stored in the LD_LIBRARY_PATH variable (DYLD_LIBRARY_PATH on macOS)
    • If you move your binary or libraries you might break it
    • See linked libs with ldd (Linux) or otool -L (macOS)
  • In this course we will use static libraries

Mixing header-only and compiled libraries requires caution 🚨

  • Let's assume we have a header file print.h:
    #include <iostream>
    #include <string>
    // Notice no inline keyword here
    void Print(const std::string& str) { std::cout << str << "\n"; }
  • We use this file in two compiled libraries: foo and bar

foo.h

void Foo();

foo.cpp

#include <print.h>
void Foo() {
  Print("Foo");
}

bar.h

void Bar();

bar.cpp

#include <print.h>
void Bar() {
  Print("Bar");
}

Mixing header-only and compiled libraries requires caution 🚨

  • Finally, we write a program that uses them: main.cpp
    #include <foo.h>
    #include <bar.h>
    int main() {
      Foo();
      Bar();
      return 0;
    }
  • And compile it as an executable main:
    c++ -std=c++17 -c -I . main.cpp -o main.o
    c++ -std=c++17 -c -I . foo.cpp -o foo.o
    ar rcs libfoo.a foo.o
    c++ -std=c++17 -c -I . bar.cpp -o bar.o
    ar rcs libbar.a bar.o
    c++ main.o -L . -I . -lfoo -lbar -o main

❌ Oops, it does not link! (build it to see the error 😉)


But... why?

  • Linker failed because we violated the ODR: the One Definition Rule
  • It states that there must be exactly one definition of every non-inline function and variable used in the program
  • We have two libraries libfoo.a and libbar.a with source files that both include the print.h and therefore have a definition of the Print(...) function
  • Our executable links to both libfoo.a and libbar.a, so it has two definitions for the Print(...) function, which causes an error

inline to the rescue! 🦸‍♀️

  • ODR allows multiple definitions of an inline function, as long as each is in a different translation unit and all of them are identical
  • So adding inline to Print(...) tells the compiler that we know there will be multiple definitions of it, and we guarantee that they are all the same!
    #include <iostream>
    #include <string>
    inline void
    Print(const std::string& str) { std::cout << str << "\n"; }
  • 🚨 The definition of an inline function must be visible in every translation unit that uses it, which is why it belongs in the header
  • 💡 inline also hints to the compiler that it may inline the function, that is, copy its binary code in place of the call; compilers are free to ignore this hint
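Put together, a single-file sketch of the fixed setup might look like this. The section comments mark which file each part would live in; in a real build foo.cpp and bar.cpp are separate translation units that each get their own copy of Print, which inline permits. The extra out parameter and CheckFooBar are assumptions added so the output can be checked; they are not in the original slides.

```cpp
#include <cassert>
#include <iostream>
#include <sstream>
#include <string>

// ---- print.h ----
#ifndef PRINT_H_
#define PRINT_H_
// inline: identical copies of this definition in several translation units
// are allowed and get merged into one at link time.
inline void Print(const std::string& str, std::ostream& out = std::cout) {
  out << str << "\n";
}
#endif  // PRINT_H_

// ---- foo.cpp (would #include <print.h>) ----
void Foo(std::ostream& out = std::cout) { Print("Foo", out); }

// ---- bar.cpp (would #include <print.h>) ----
void Bar(std::ostream& out = std::cout) { Print("Bar", out); }

// ---- check (stands in for main.cpp) ----
void CheckFooBar() {
  std::ostringstream captured;  // collect the output instead of printing it
  Foo(captured);
  Bar(captured);
  assert(captured.str() == "Foo\nBar\n");
}
```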

We have to be careful!

Let's change our foo.cpp and bar.cpp a little

foo.cpp

#include <iostream>
inline void Print() {
  std::cout << "Foo\n";
}
void Foo() { Print(); }

bar.cpp

#include <iostream>
inline void Print() {
  std::cout << "Bar\n";
}
void Bar() { Print(); }

main.cpp

#include <foo.h>
#include <bar.h>
int main() {
  Foo(); Bar(); return 0;
}

Output of ./main?

Bar
Bar

😱

c++ -std=c++17 -c -I . foo.cpp -o foo.o && ar rcs libfoo.a foo.o
c++ -std=c++17 -c -I . bar.cpp -o bar.o && ar rcs libbar.a bar.o
c++ -std=c++17 main.cpp -L . -I . -lfoo -lbar -o main

Welcome back to the UB land!



What happened?

  • We have two functions with the same signature:
    void Print();
  • The definitions of this function are different and live in different translation units: foo.cpp and bar.cpp
  • When we link them together into main, the linker sees multiple definitions of an inline function and assumes they are the same
  • It keeps one of them (typically the first it encounters) and discards the other, so Foo() and Bar() both end up calling the same Print()

How to avoid errors?

  • Don't use inline in source files
  • ✅ Always use inline if you define functions in headers
  • ✅ Do the same for constants 🔼1️⃣7️⃣
    inline constexpr auto kConst = 42;
  • ✅ Use namespaces rigorously
  • ✅ Use unnamed namespaces in your source files for functions and constants used only within that source file

foo.cpp

#include <iostream>
namespace {
void Print() {
  std::cout << "Foo\n";
}
} // namespace
void Foo() { Print(); }

bar.cpp

#include <iostream>
namespace {
void Print() {
  std::cout << "Bar\n";
}
} // namespace
void Bar() { Print(); }
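These recommendations can be combined in one sketch: an inline constant and a declaration inside a named namespace (the header part), plus a helper with internal linkage in an unnamed namespace (the source part). The names Label, Describe, kRepetitions and CheckDescribe are illustrative assumptions and do not appear in the original slides.

```cpp
#include <cassert>
#include <string>

// ---- foo.h (hypothetical) ----
namespace foo {
// C++17 inline variable: safe to define in a header included in many places.
inline constexpr auto kRepetitions = 2;
std::string Describe();  // declaration only
}  // namespace foo

// ---- foo.cpp (hypothetical) ----
namespace {
// Internal linkage: another source file can define its own Label() without
// clashing with this one.
std::string Label() { return "Foo"; }
}  // namespace

namespace foo {
std::string Describe() {
  std::string result;
  for (int i = 0; i < kRepetitions; ++i) { result += Label(); }
  return result;
}
}  // namespace foo

// Usage check: the helper is reachable here, inside its own translation unit.
void CheckDescribe() { assert(foo::Describe() == "FooFoo"); }
```

Because Label has internal linkage, the linker never sees it as a shared symbol, so the "Bar Bar" surprise from the earlier example cannot occur.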

Summary

  • Use libraries to reuse/share your code
  • You have 3 options for libraries:
    • Header-only
    • Static
    • Dynamic
  • Each has its own benefits and downsides
  • In this course we will mostly use a combination of header-only and static libraries
