dforsber's comments

Shameless plug - Data Tap[1] is a custom-made Lambda function with embedded DuckDB and an AWS-managed ingestion URL to which you can HTTP POST as much data as you like. It buffers your data and then uses a DuckDB SQL statement to land the data as Parquet on S3.

- Deploy to your own account (BYOC)

- "S3 first": Partitioned and compressed Parquet on S3

- Secure sharing of write access to the Data Tap URL

- 50x more cost efficient than e.g. "Burnhose", while also having the unmatched scalability of Lambda (1000 instances in 10s).

[1] https://www.taps.boilingdata.com/ (founder)
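
To illustrate the buffer-then-land flow described above, here is a minimal sketch in Python. It is not Data Tap's actual code: `TapBuffer`, `build_copy_sql`, the file path, the bucket name, and the partition columns are all hypothetical. It only buffers records as newline-delimited JSON and builds the kind of DuckDB `COPY` statement that writes partitioned, compressed Parquet.

```python
import json


class TapBuffer:
    """Hypothetical in-memory buffer for newline-delimited JSON records."""

    def __init__(self, max_records):
        self.max_records = max_records
        self.lines = []

    def add(self, record):
        """Buffer one record; return True once the buffer should be flushed."""
        self.lines.append(json.dumps(record, sort_keys=True))
        return len(self.lines) >= self.max_records

    def drain(self):
        """Return the buffered records as NDJSON text and reset the buffer."""
        payload = "\n".join(self.lines) + "\n" if self.lines else ""
        self.lines = []
        return payload


def build_copy_sql(ndjson_path, s3_prefix, partition_by):
    """Build a DuckDB COPY statement for partitioned, ZSTD-compressed Parquet.

    The paths and column names are assumptions and are interpolated as-is,
    so they must come from trusted configuration, not from posted data.
    """
    columns = ", ".join(partition_by)
    return (
        f"COPY (SELECT * FROM read_json_auto('{ndjson_path}')) "
        f"TO '{s3_prefix}' "
        f"(FORMAT PARQUET, PARTITION_BY ({columns}), COMPRESSION ZSTD);"
    )
```

In a real deployment the drained payload would be written to local storage and the statement executed by DuckDB with S3 access configured (for example via its httpfs extension); both steps are left out here.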


Nobody says the obvious: Using DuckDB as the Catalog. You can easily do snapshots, and thus also time travel.
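
A rough sketch of what a snapshot catalog with time travel could look like. To keep it runnable with only the Python standard library it uses `sqlite3` as a stand-in for DuckDB; the table names, columns, and helper functions are hypothetical, and the same two-table layout could live in a DuckDB database file instead.

```python
import sqlite3

# Stand-in for a DuckDB catalog database (assumption: sqlite3 used for portability).
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE snapshots (snapshot_id INTEGER PRIMARY KEY, created_at TEXT);
CREATE TABLE snapshot_files (snapshot_id INTEGER, file_path TEXT);
""")


def commit_snapshot(con, created_at, file_paths):
    """Record a new snapshot listing the full set of Parquet files it contains."""
    cur = con.execute(
        "INSERT INTO snapshots (created_at) VALUES (?)", (created_at,)
    )
    snapshot_id = cur.lastrowid
    con.executemany(
        "INSERT INTO snapshot_files VALUES (?, ?)",
        [(snapshot_id, path) for path in file_paths],
    )
    return snapshot_id


def files_as_of(con, timestamp):
    """Time travel: return the files of the latest snapshot at or before timestamp.

    Assumes ISO 8601 timestamps (which sort as text) and snapshot ids
    that increase with commit time.
    """
    row = con.execute(
        "SELECT max(snapshot_id) FROM snapshots WHERE created_at <= ?",
        (timestamp,),
    ).fetchone()
    if row[0] is None:
        return []
    rows = con.execute(
        "SELECT file_path FROM snapshot_files WHERE snapshot_id = ? "
        "ORDER BY file_path",
        (row[0],),
    )
    return [r[0] for r in rows]
```

The list returned by `files_as_of` is what a query engine would then read, e.g. by passing it to DuckDB's `read_parquet` with a list of files.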


I wrote a custom C++ AWS Lambda runtime/handler together with DuckDB and, to my surprise, created a very efficient and simple streaming data ingestion solution that outperforms others.

