cassandra-ETL

cassandra-ETL is an ETL project for an imaginary startup called Sparkify. The project creates a NoSQL keyspace using Apache Cassandra.

For the purposes of this project, the data is inserted from a CSV file that contains all the information we need. The resulting tables let the analytics team find out which songs users are listening to.
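As a rough illustration of the CSV-to-Cassandra step described above, here is a minimal sketch. The table name `songs_by_session`, the column names, the `sparkify` keyspace, and the local host address are all assumptions made for the example; the actual notebook may use different names and may read the file with pandas rather than the standard `csv` module used here.

```python
import csv

# Hypothetical table and columns; the real CSV header may differ.
INSERT_CQL = (
    "INSERT INTO songs_by_session (session_id, item_in_session, artist, song) "
    "VALUES (?, ?, ?, ?)"
)


def load_csv(session, csv_path):
    """Insert every row of csv_path through a prepared statement.

    `session` is expected to behave like cassandra.cluster.Session.
    Returns the number of rows inserted.
    """
    prepared = session.prepare(INSERT_CQL)
    count = 0
    with open(csv_path, newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            # CSV values arrive as strings, so cast the numeric columns.
            session.execute(prepared, (
                int(row["session_id"]),
                int(row["item_in_session"]),
                row["artist"],
                row["song"],
            ))
            count += 1
    return count


def connect(keyspace="sparkify", host="127.0.0.1"):
    """Open a session on a local cluster (assumed host and keyspace)."""
    from cassandra.cluster import Cluster  # provided by cassandra-driver
    cluster = Cluster([host])
    return cluster, cluster.connect(keyspace)
```

With a running cluster, usage would look like `cluster, session = connect()` followed by `load_csv(session, "events.csv")`, where `events.csv` is a placeholder file name.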

Prerequisites

You need the following libraries. I recommend installing them with pip install:

  • cassandra-driver
  • json (included in the Python standard library, so no install is needed)
  • pandas
  • numpy
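If you are unsure whether everything is in place, a small check like the one below can help. It is only a sketch: the mapping from pip package name to import name is written by hand, and the only non-obvious entry is that `cassandra-driver` is imported as `cassandra`.

```python
from importlib.util import find_spec

# pip package name -> module name used in import statements
REQUIRED = {
    "cassandra-driver": "cassandra",
    "pandas": "pandas",
    "numpy": "numpy",
    "json": "json",  # part of the standard library
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose import module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    print("Missing:", ", ".join(missing) if missing else "none")
```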

Getting Started

This project was created with Jupyter Notebook, so first make sure you have the tools needed to open .ipynb files.

You can find all the code for the ETL pipeline in the file Project_1B_Cassandra.ipynb.

The purpose of this project is to create tables based on 3 queries that the analytics team needs. For each of those 3 queries, the notebook contains the corresponding CREATE, INSERT, and SELECT statements.
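In Cassandra, tables are usually designed around the queries they must answer, which is why each query gets its own table. The sketch below shows the idea for one made-up query; the table name, columns, and the query itself are hypothetical and are not taken from the notebook. The columns used in the WHERE clause become the primary key, with the partition key first and the clustering columns after it.

```python
def create_table_cql(table, columns, partition_key, clustering=()):
    """Build a CREATE TABLE statement whose primary key matches a query.

    columns: mapping of column name -> CQL type
    partition_key: columns the query filters on with equality
    clustering: extra key columns that order rows within a partition
    """
    cols = ", ".join(f"{name} {cql_type}" for name, cql_type in columns.items())
    pk = "(" + ", ".join(partition_key) + ")"
    key = ", ".join([pk, *clustering])
    return f"CREATE TABLE IF NOT EXISTS {table} ({cols}, PRIMARY KEY ({key}))"


# Hypothetical query: which song was played at a given position in a session?
SELECT_CQL = (
    "SELECT artist, song FROM songs_by_session "
    "WHERE session_id = ? AND item_in_session = ?"
)

CREATE_CQL = create_table_cql(
    "songs_by_session",
    {
        "session_id": "int",
        "item_in_session": "int",
        "artist": "text",
        "song": "text",
    },
    partition_key=["session_id"],
    clustering=["item_in_session"],
)

if __name__ == "__main__":
    print(CREATE_CQL)
```

Because both columns in the WHERE clause are part of the primary key, Cassandra can serve this query without ALLOW FILTERING.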

Feel free to ask questions about the code!