dltHub education: We've got 2 courses coming up soon, and a 3rd one in planning (dlt+ Fabric). If you would like to participate, sign up here: https://dlthub.com/events
dltHub
Software Development
Supporting a new generation of Python users when they create and use data in their organizations
About
Since 2017, the number of Python users has been increasing by millions annually. The vast majority of these people leverage Python as a tool to solve problems at work. Our mission is to make them autonomous when they create and use data in their organizations. To this end, we are building an open source Python library called data load tool (dlt). Our users run dlt in their Python scripts to turn messy, unstructured data into regularly updated datasets. It empowers them to create data pipelines that are highly scalable, easy to maintain, and straightforward to deploy, without having to wait for help from a data engineer. We are dedicated to keeping dlt an open source project surrounded by a vibrant, engaged community. To make this sustainable, dltHub stewards dlt while also offering additional software and services that generate revenue (similar to what GitHub does with Git). dltHub is based in Berlin and New York City. It was founded by data and machine learning veterans. We are backed by Dig Ventures and many technical founders from companies such as Hugging Face, Instana, Matillion, Miro, and Rasa.
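To give a concrete (hypothetical) flavour of that workflow, here is a minimal sketch of loading a nested Python record into a local DuckDB file with dlt; the pipeline, dataset, and table names are made up for illustration:

```python
import dlt

# Hypothetical messy record: nested dict plus a list, as it might come from an API.
data = [{"id": 1, "user": {"name": "Ada"}, "tags": ["a", "b"]}]

pipeline = dlt.pipeline(
    pipeline_name="quickstart",   # illustrative name
    destination="duckdb",         # loads into a local DuckDB file
    dataset_name="example",
)

# dlt infers the schema, normalizes the nested structures, and loads the tables.
load_info = pipeline.run(data, table_name="records")
print(load_info)
```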
- Website
- https://dlthub.com/
- Industry
- Software Development
- Company size
- 11–50 employees
- Headquarters
- Berlin
- Type
- Privately held
- Founded
- 2022
Locations
- Primary
Berlin, DE
Employees at dltHub
Updates
-
dltHub reposted this
Is data ingestion giving you trouble? It doesn't have to be that complicated. 🔧

Despite advances in data engineering, ingestion remains a major pain point for data teams. Why?
- UI-based tools lacking scalability for production.
- Open-source solutions creating messy, hard-to-maintain code.
- In-house solutions falling short on security, monitoring, and reliability.

💡 But there's a better way. Tools like dltHub and Prefect bring software engineering best practices to data ingestion, making scalable, code-first pipelines easier than ever.

Learn how:
- dlt defines connectors and pipelines as code.
- Prefect handles orchestration with automation and scheduling.
- Together, they enable robust, modular data pipelines.

👉 Read the full article and learn how to build robust, scalable data ingestion pipelines: https://lnkd.in/dXaBGTCY

#DataEngineering #ELT #DataPipelines #DataPlatform
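For readers who want to see what "pipelines as code plus orchestration" looks like in practice, here is a minimal, hedged sketch (not the code from the linked article): a made-up dlt resource loaded to DuckDB inside a Prefect task and flow. The names, destination, and retry settings are illustrative assumptions.

```python
import dlt
from prefect import flow, task

@dlt.resource(name="events", write_disposition="append")
def events():
    # Hypothetical inline data; a real connector would yield API pages here.
    yield [{"id": 1, "type": "click"}, {"id": 2, "type": "view"}]

@task(retries=2)
def load_events() -> None:
    # dlt: the connector and pipeline are plain code.
    pipeline = dlt.pipeline(
        pipeline_name="events_pipeline",
        destination="duckdb",       # swap for your warehouse destination
        dataset_name="raw_events",
    )
    print(pipeline.run(events()))

@flow
def ingestion_flow() -> None:
    # Prefect: scheduling, retries, and observability around the load.
    load_events()

if __name__ == "__main__":
    ingestion_flow()
```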
-
What do Reddit users say about migrating to Iceberg?

Apache Iceberg is getting a lot of attention. People love its interoperability, freedom from vendor lock-in, and better scalability. But is it actually delivering on those promises, or is it just adding complexity? Here's what folks in the trenches are saying:

💭 "We're using it as half the business is Athena on a data lake, and the other half is Snowflake and dbt boys. So Iceberg allows the silos to meet in the middle somewhat." – wallyflops

💭 "We have hundreds of terabytes of event data and need to remove some lines due to GDPR. Having a ton of metadata (which Iceberg basically is) and tools like hidden partitions, z-ordering, etc., helps a lot." – data_grind

💭 "We are switching (eventually over one or two years) to Iceberg. The goal? Allow data to be queried from other compute engines—Trino and Snowflake primarily." – SupermarketMost7089

💭 "Switching to Iceberg can be useful, but it really depends on your data stack and use case. A media company switched to Iceberg to enable analytics teams to ingest data from various sources without being tied to one specific processing engine." – Signal-Indication859

Thinking about making the switch? Your company might want to check out dlt's Iceberg options: whether you're leaning towards Filesystem or Athena, dlt has you covered.

Or maybe Iceberg still feels like a bag of trade-offs? No worries—there's a whole Reddit thread debating it. Jump in, see what people are saying, and let's discuss. 👇

#ApacheIceberg #DataEngineering #BigData #DataLakes
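If you want to try the Filesystem route mentioned above, here is a minimal sketch based on dlt's Iceberg table format on the filesystem destination. The resource, dataset, and bucket configuration are illustrative assumptions; check the dlt docs for the required extras (e.g. pyiceberg) and current option names.

```python
import dlt

# Hypothetical resource; table_format="iceberg" asks the filesystem destination
# to write this table in Iceberg format (requires the relevant dlt extras).
@dlt.resource(table_format="iceberg", primary_key="id", write_disposition="append")
def users():
    yield [{"id": 1, "name": "Ada"}, {"id": 2, "name": "Grace"}]

pipeline = dlt.pipeline(
    pipeline_name="iceberg_demo",
    destination="filesystem",  # bucket_url is set in config.toml / secrets.toml
    dataset_name="lake",
)
print(pipeline.run(users()))
```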
-
Iceberg, dlt, SQLMesh, serverless, modelling automation. Do I need to say more? Check out this beautiful project! https://lnkd.in/e2B6bRDJ
🚀 Excited to share my serverless lakehouse implementation!

I've built a complete data platform that demonstrates how to implement a serverless lakehouse architecture combining the HOOK methodology with Unified Star Schema for analytics.

The project:
- Transforms AdventureWorks data through a clean, three-layer Analytical Data Storage System (ADSS) architecture (DAS→DAB→DAR)
- Uses the HOOK methodology for business alignment without complex ETL
- Implements a full Unified Star Schema with extended Puppini Bridge functionality
- Provides point-in-time analysis through intelligent temporal resolution
- Runs completely serverless using DuckDB, Iceberg, SQLMesh by Tobiko, dlt by dltHub, and Streamlit

This implementation demonstrates how we can achieve both technical excellence and business alignment in data modeling - without sacrificing either. The solution generates 200+ models programmatically via configuration, making it incredibly maintainable and extensible.

If you're interested in business-aligned data modeling or implementing the HOOK methodology in practice, check out the repository!

GitHub Repo: https://lnkd.in/dvS5-mwZ

#DataEngineering #HOOK #DataModeling #DataArchitecture #DataWarehouse #Lakehouse #DuckDB #Serverless #Iceberg #UnifiedStarSchema #AnalyticalDataStorageSystem #SQLMesh #dlt #Streamlit
-
How do you run dlt on Airflow? Check out this comprehensive guide from our consulting partners Untitled Data Company: https://lnkd.in/eVv4WRdc
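As a hedged companion to the guide, dlt ships an Airflow helper that decomposes a pipeline run into Airflow tasks. The sketch below follows that pattern with made-up names; the exact arguments and deployment details may differ from what the guide recommends.

```python
import dlt
import pendulum
from airflow.decorators import dag
from dlt.helpers.airflow_helper import PipelineTasksGroup

@dlt.resource(name="events")
def events():
    # Hypothetical data; a real DAG would pull from your source system.
    yield [{"id": 1, "value": "a"}, {"id": 2, "value": "b"}]

@dag(schedule="@daily", start_date=pendulum.datetime(2024, 1, 1), catchup=False)
def load_events_dag():
    # Groups one dlt pipeline run (extract/normalize/load) into Airflow tasks.
    tasks = PipelineTasksGroup("events_pipeline", use_data_folder=False, wipe_local_data=True)

    pipeline = dlt.pipeline(
        pipeline_name="events_pipeline",
        destination="duckdb",      # swap for your warehouse destination
        dataset_name="events_data",
    )
    tasks.add_run(pipeline, events(), decompose="serialize", trigger_rule="all_done", retries=0)

load_events_dag()
```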
-
Managed cloud databases mark up compute by 35-70x. It was fine when humans ran queries. AI doesn't need coddling.

Iceberg lets you:
✅ Run queries anywhere (Trino, DuckDB, Athena—your choice)
✅ Cut insane compute costs
✅ Keep control over your data, not rent access to it

It's your budget. Stop setting it on fire. https://lnkd.in/efVyNpVt
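"Run queries anywhere" can be as small as pointing DuckDB at an Iceberg table. A hedged sketch follows; the table location is hypothetical, and reading from S3 additionally needs the httpfs extension plus credentials.

```python
import duckdb

con = duckdb.connect()
# DuckDB's iceberg extension provides iceberg_scan() for reading Iceberg tables.
con.execute("INSTALL iceberg; LOAD iceberg;")

# Hypothetical local table path; for s3:// paths also INSTALL/LOAD httpfs and
# configure credentials (e.g. via CREATE SECRET).
rows = con.execute(
    "SELECT count(*) AS events FROM iceberg_scan('warehouse/lake/events')"
).fetchall()
print(rows)
```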
-
Make sure to check out the dlt-SQLMesh integration that scaffolds your project and takes you from loading to transforming in one CLI command. Check out this gem! https://lnkd.in/e9qX_Mjt
Wow, it's amazing what we've achieved in SQLMesh in the last two quarters! We're at our Tobiko biannual offsite and Iaroslav Zeigerman is going through what our core team has accomplished. Amazing features like true multi-engine support, linting, blueprinting, ClickHouse and Athena support, Snowflake dynamic tables, and dltHub integration! #SQLMesh is moving ahead at lightning speed. Can't wait to see where we'll be at the next offsite!!
-
dltHub reposted this
Migrate your data stack without breaking it. Iceberg is AI-ready, composable, and modular - just start using it! Read more: https://lnkd.in/e_EETaKV
-
🔥 Why pay cloud overhead? Test pipelines locally with BYOC.

In data engineering, we often default to spinning up cloud instances, but there isn't always a need. Your local machine is ready to handle more than you might expect. BYOC (Bring Your Own Compute) lets you run workloads on your own machine instead of relying on cloud servers.

How BYOC can help:
🔹 Local development: running workloads on your own hardware isn't just about saving costs - it's about transforming your development workflow. Think faster iterations, immediate feedback, and zero waiting time for cloud resources to spin up.
🔹 Zero-cost iterations: test, fail, adjust, repeat - all without touching your cloud budget. It's like having an infinite sandbox for your data experiments.

The workflow is beautifully simple (see the sketch below):
1. Build your pipeline locally with dlt
2. Test, query, and analyze with dlt Cache
3. Push to production only when everything's perfect

Why this matters:
- Faster development cycles
- Complete control over your compute
- No surprise cloud bills
- Instant feedback loops

What's particularly exciting is how this fits into modern data practices. No more hoping your code works in production - you've already proven it locally.

Have you experimented with BYOC? Are you interested in trying it? Discuss below!
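Here is a minimal local-first sketch of steps 1-2 above, assuming DuckDB as the zero-cost local destination; "dlt Cache" from the post is a dltHub product and is not shown here, and all names are illustrative.

```python
import dlt

@dlt.resource
def orders():
    # Hypothetical sample rows standing in for a real source.
    yield [{"order_id": 1, "amount": 42.0}, {"order_id": 2, "amount": 13.5}]

# Step 1: build and run the pipeline entirely on your machine.
pipeline = dlt.pipeline(
    pipeline_name="local_dev",
    destination="duckdb",   # local file, no cloud warehouse involved
    dataset_name="staging",
)
pipeline.run(orders())

# Step 2: test and query the loaded data immediately, no cloud round-trip.
with pipeline.sql_client() as client:
    with client.execute_query("SELECT count(*) FROM orders") as cursor:
        print(cursor.fetchall())
```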
-
Try the dltHub AI code assistant!
dltHub built a custom AI code assistant that helps you create and maintain dlt pipelines. We are particularly excited about their MCP server block, which brings in context about your pipeline runs, database tables, and more. Check out their blog post: https://lnkd.in/gYCVRqx6