Big-Data with Apache Spark and Python.
Updated Jun 28, 2024 (Python)
SQL stream processing, analytics, and management. We decouple storage and compute to offer instant failover, dynamic scaling, speedy bootstrapping, and efficient joins.
🏆 Spark4You Design patterns
Ophelia, a PySpark analytics wrapper.
An open source framework for building data analytic applications.
Benchmarks for data processing systems: Pathway, Spark, Flink, Kafka Streams
LiveBeats: a live dashboard for real-time music streaming insights.
Real-Time Monitor Panel for Systems Infected by a Keylogger.
Project to stream real-time orders and apply ETL pipelines and analytics using Databricks, Kafka, and AWS.
WASP is a framework for building complex real-time big data applications. It relies on a kind of Kappa/Lambda architecture, mainly leveraging Kafka and Spark. If you need to ingest huge amounts of heterogeneous data and analyze it through complex pipelines, this is the framework for you.
⏱ Real-Time Sentiment Analysis using PySpark and simulation of Twitter/X API using FastAPI
Enabling Continuous Data Processing with Apache Spark and Azure Event Hubs
Data-Driven Sentiment Insight into Twitter (X) Trends | Kafka | Spark | Spark MLlib | Docker
Processing data streams with Kafka + Spark
Generate relevant synthetic data quickly for your projects. The Databricks Labs synthetic data generator (aka `dbldatagen`) may be used to generate large simulated / synthetic data sets for tests, POCs, and other uses in Databricks environments, including in Delta Live Tables pipelines.
Cryptocurrency monitoring system. Includes data scraping (Selenium), processing (Spark), and visualization (Grafana). Pub-sub messaging using Kafka, persistence of processed data using PostgreSQL. Deployment with Docker Compose.
Data Accelerator for Apache Spark simplifies onboarding to streaming of big data. It offers a rich, easy-to-use experience to help with creation, editing, and management of Spark jobs on Azure HDInsight or Databricks while enabling the full power of the Spark engine.
.NET for Apache® Spark™ makes Apache Spark™ easily accessible to .NET developers.
A suite of benchmark applications for distributed data stream processing systems.