Building project takes a *long* time (esp compilation time for datafusion core crate) #13814

Open
Tracked by #13813
alamb opened this issue Dec 17, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@alamb
Contributor

alamb commented Dec 17, 2024

Is your feature request related to a problem or challenge?

Compiling the datafusion crate currently takes 40 seconds on my machine, far longer than any other crate

This slows down CI builds as well as my own local development workflow

For example, running

# start from a clean checkout
rm -rf target
cargo build --timings

Generates a chart as follows (attached here): carg-timings.zip

[Image: Screenshot 2024-12-17 at 10 23 44 AM]

Describe the solution you'd like

I would like to speed up compilation somehow -- likely by decreasing the time required for datafusion-core

Describe alternatives you've considered

I think the first thing would be to figure out if possible what is taking up so much time when building the core crate

I suspect it has to do with listing table / some of the various file format support, but I don't have data to justify that

Additional context

No response

@Omega359
Contributor

install sccache and tell rust to use it

export RUSTC_WRAPPER=sccache
alias cargo="RUSTFLAGS='-Z threads=8' cargo +nightly"

With the above two options I see a full build on my machine taking 100.8s (1m 40.8s).

https://corrode.dev/blog/tips-for-faster-rust-compile-times/
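The wrapper setting above only helps on machines where sccache is actually installed. A minimal sketch of a guard for a shared setup script, assuming a POSIX-style shell; `enable_sccache_if_present` is a hypothetical helper name, not something from this thread or from sccache itself:

```shell
# enable_sccache_if_present: set RUSTC_WRAPPER only when the sccache binary
# can be found on PATH, so builds still work on machines without it.
enable_sccache_if_present() {
  if command -v sccache >/dev/null 2>&1; then
    export RUSTC_WRAPPER=sccache
    echo "sccache enabled"
  else
    unset RUSTC_WRAPPER
    echo "sccache not found; building without a wrapper"
  fi
}

# Hypothetical usage:
#   enable_sccache_if_present
#   cargo build
#   sccache --show-stats   # check whether the build produced cache hits
```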

@alamb
Contributor Author

alamb commented Dec 17, 2024

> With the above two options I see a full build on my machine taking 100.8s (1m 40.8s).

This makes sense -- thank you @Omega359

However, I don't think it will help CI, where this kind of caching isn't effective (the intermediate rust files are so large that using the cargo cache didn't work well when we tried it in the past)

I would love to figure out how to break the datafusion core crate into smaller pieces / crates that can be compiled in parallel

@Omega359
Contributor

fwiw I tried to gather a bit more info with llvm-lines:

cargo llvm-lines -p datafusion --lib | head -20

Lines                  Copies               Function name
  -----                  ------               -------------
  1220519                36986                (TOTAL)
    44221 (3.6%,  3.6%)    423 (1.1%,  1.1%)  <alloc::vec::Vec<T> as alloc::vec::spec_from_iter_nested::SpecFromIterNested<T,I>>::from_iter
    25426 (2.1%,  5.7%)    265 (0.7%,  1.9%)  alloc::vec::Vec<T,A>::extend_desugared
    22436 (1.8%,  7.5%)    142 (0.4%,  2.2%)  <arrow_array::array::primitive_array::PrimitiveArray<T> as core::iter::traits::collect::FromIterator<Ptr>>::from_iter
    21263 (1.7%,  9.3%)     45 (0.1%,  2.4%)  <core::iter::adapters::flatten::FlattenCompat<I,U> as core::iter::traits::iterator::Iterator>::size_hint
    16910 (1.4%, 10.7%)    151 (0.4%,  2.8%)  <core::slice::iter::Iter<T> as core::iter::traits::iterator::Iterator>::fold
    15949 (1.3%, 12.0%)    172 (0.5%,  3.2%)  alloc::vec::Vec<T,A>::extend_trusted
    12204 (1.0%, 13.0%)    116 (0.3%,  3.6%)  <core::iter::adapters::GenericShunt<I,R> as core::iter::traits::iterator::Iterator>::try_fold::{{closure}}
    10585 (0.9%, 13.8%)     88 (0.2%,  3.8%)  <alloc::vec::into_iter::IntoIter<T,A> as core::iter::traits::iterator::Iterator>::try_fold
    10438 (0.9%, 14.7%)    295 (0.8%,  4.6%)  <core::iter::adapters::map::Map<I,F> as core::iter::traits::iterator::Iterator>::next
     9702 (0.8%, 15.5%)     53 (0.1%,  4.7%)  alloc::vec::in_place_collect::from_iter_in_place
     8645 (0.7%, 16.2%)     13 (0.0%,  4.8%)  std::io::default_read_to_end
     8256 (0.7%, 16.9%)    116 (0.3%,  5.1%)  core::iter::adapters::try_process
     7939 (0.7%, 17.5%)    356 (1.0%,  6.0%)  alloc::boxed::Box<T>::new
     7673 (0.6%, 18.2%)    236 (0.6%,  6.7%)  core::iter::adapters::map::map_fold::{{closure}}
     7375 (0.6%, 18.8%)      1 (0.0%,  6.7%)  datafusion::physical_planner::DefaultPhysicalPlanner::map_logical_node_to_physical::{{closure}}
     7352 (0.6%, 19.4%)    241 (0.7%,  7.3%)  tokio::loom::std::unsafe_cell::UnsafeCell<T>::with_mut
     7317 (0.6%, 20.0%)    102 (0.3%,  7.6%)  core::iter::traits::iterator::Iterator::try_fold
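
Most of the top rows above are generic standard-library instantiations, so it may help to narrow the report to functions defined in a single crate. A minimal sketch, assuming the row layout shown above (two parenthesised percentage groups followed by the function name); `only_crate` is a hypothetical helper name, not part of cargo-llvm-lines. It only matches rows whose name starts with the crate path, so generic code instantiated from that crate's call sites is not attributed to it.

```shell
# only_crate: read `cargo llvm-lines` output on stdin and keep only the rows
# whose function-name column starts with the given crate path.
# Assumption: each row looks like "LINES (..%, ..%)  COPIES (..%, ..%)  NAME".
only_crate() {
  grep -E "\)[[:space:]]+$1::"
}

# Hypothetical usage:
#   cargo llvm-lines -p datafusion --lib | only_crate datafusion | head -20
```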

@findepi
Member

findepi commented Dec 18, 2024

> I would love to figure out how to break the datafusion core crate into smaller pieces / crates that can be compiled in parallel

yes! and move around sqlparser dependency when doing so :)

@tustvold
Contributor

tustvold commented Dec 19, 2024

FWIW the only reliable mechanism I've found to measure this is to comment out modules and measure the impact on compilation time. Llvm-lines and cargo-bloat can be informative, but not all lines are created equal. This is especially true for code with complex lifetimes and/or async.
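
One way to make that comparison repeatable is to time a rebuild of just the one crate before and after commenting out a module. A rough sketch, assuming a POSIX-style shell with whole-second resolution from `date`; `time_cmd` is a hypothetical helper, and the cargo invocations in the comments are a suggested workflow rather than something prescribed in this thread:

```shell
# time_cmd: run any command, discard its output, and print the elapsed
# wall-clock time in whole seconds.
time_cmd() {
  local start end
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  end=$(date +%s)
  echo $(( end - start ))
}

# Hypothetical usage: compare the number printed before and after
# commenting out a module in the crate.
#   cargo build                        # warm up: build all dependencies once
#   cargo clean -p datafusion          # drop only this crate's artifacts
#   time_cmd cargo build -p datafusion
```

Because the dependencies stay built, the printed number should mostly reflect the crate itself rather than the whole workspace.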

Anything making extensive use of macros is likely a good place to start
