My Knowledge Lakehouse (tabokie.github.io)
212 points by tabokie on Jan 14, 2024 | 95 comments


Obsidian. Total game changer. Obsidian sync is completely worth paying for as well - firstly it helps support the project but also it's slick, so well made, and easier than all the hacked together GitHub / s3 / whatever workarounds.

From a content point of view I personally find having structure but not too much structure is the way to go. A bit of yaml if needed but just a gentle “daily notes = meeting notes” (on my work obsidian) and “daily notes = journal” (on my personal one), a few tags but nothing too rigid or cumbersome.

Omnisearch is the dogs nuts plugin too, and a couple of others but again, keeping it quite minimal has really worked for me.

Just so nice knowing that everything is under my control, using just markdown but with this fantastically powerful framework underneath it. Huge fan.


I really wish that Obsidian Sync had like... 5 megs for free or something. I can't justify paying $8/month for a thing I haven't even started using, and without sync I have a hard time saying "OK let's try this" compared to Joplin or Notion.

There's a bunch of cool looking stuff though


You can use git and it isn't too difficult or unfriendly. But I totally get it. I'm paying for sync and I do love Obsidian, but it feels a bit steep. I do want them to stick around though, so, I just pay for it.


I come from a different perspective: I would love to pay $15 per month if it gave me sync with my private storage and worked on iOS.

Luckily for both of us there is a solution https://github.com/vrtmrz/obsidian-livesync

Works well on ios.


I just store my vault in an iCloud Drive folder and it syncs across all my macOS / iOS devices. I find it pretty quick to sync live as well. No added cost assuming you have enough iCloud storage to cover your vault.


One more cool thing about setting up with iCloud (see other comment) is you can use Apple Shortcuts or other apps to do cool things to the markdown files directly.


Any other sync solution will do the job - it's just text files.

If you just want to support the project, they have the Catalyst once-off payment that gives you beta access and VIP status on their Discord.


I did some quick git scripts with cron on mac, and windows scheduler to automatically sync between locations. Unfortunately the only git sync plugins I found only worked with password based auth on android.

Aside from not being easily accessible by phone, this works and the .obsidian folder shares the environment + plugins (minus one for open tab/window layout).

If anyone has a solution to the lack of git android ssh auth problem I'm all ears though.


I was in a similar spot. After too much hand wringing I decided to sign up for monthly ($10/month) sync and give it a solid two months. These are important tools that I’m happy to pay for, but they do take some time to evaluate. I figured $20 was a reasonable amount to commit to.

FWIW I ended up going all in on Obsidian and paying yearly, but that outcome was far from a foregone conclusion, even a month in. I went through a full cycle of honeymoon phase, then discovering warts, and then workarounds, etc. It does take time.


It's just a folder of plaintext files, you can do whatever you want with it, including pointing your Dropbox-like service to it.

I also pay for sync because I genuinely love the product (though I pay $5/month early bird pricing), but thanks to their file-first philosophy, I'm not locked in.


I've been using syncthing to sync between my desktop, laptop, server, and phone for over a year and it works well.


I use iCloud instead and it works great.


What's your annual salary divided by 2,000 (VERY roughly your hourly rate), and how many months of the service would it buy you as you try it out?

Totally understand if you're not employed, or barely making ends meet, but given the forum I'm going to take a shot in the dark here.
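As a rough worked example of that arithmetic, here is a minimal sketch. The $100,000 salary is a made-up placeholder, and the $8/month price is the figure quoted upthread; neither is anyone's real number.

```python
def months_per_hour_of_pay(annual_salary: float, monthly_price: float) -> float:
    """Months of a subscription that one hour of pay covers (salary / 2,000 rule of thumb)."""
    hourly_rate = annual_salary / 2000  # very roughly 2,000 working hours per year
    return hourly_rate / monthly_price


# Hypothetical figures: $100,000 salary, $8/month subscription.
print(months_per_hour_of_pay(100_000, 8))  # 6.25
```

Under those assumptions, one hour of pay covers about half a year of trying the service out.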


I like Obsidian, but this response sounds like you just read the title and not the article.

I recommend reading the article, it was pretty interesting.


I did. I thought the beginning was great but it lost me after a while, hence my comments about “structure but not too much structure”


I haven't checked Obsidian in a while, can you explain why you love it so much? I hear it being discussed a lot.


I think it’s the balance of being “just markdown” on one hand - you can literally just point it at a folder and start typing; and the fact that it’s incredibly extendable on the other. The better known plugins like DataView give you really powerful list / sort / search if you want that, but you can also make it look however you want. I’m strangely snobby about my UIs, so being able to take a theme that’s almost right and then tweak the css is a big deal for me.

Finally - it’s just a fast and solid bit of engineering. Rarely / (never, actually?) crashes, shortcuts are really powerful, constantly being updated… all manner of great :-)


Was thinking the same thing reading this. I use my daily note as the main launchpad for everything I'm writing each day, I add random thoughts in there and link out to separate notes for bigger tasks/ideas/information.

The daily note is so useful because when I look at an old resource I can see from the back links exactly when I had taken notes on it before and go to that daily note to see any additional context of what I was doing that day.

I use Trello for task management and jotting down things to investigate later.


> A bit of yaml if needed but just a gentle “daily notes = meeting notes” (on my work obsidian) and “daily notes = journal” (on my personal one), a few tags but nothing too rigid or cumbersome.

Can you explain this one? I use it and the daily note thing bugs me since I would rather separate work and personal in the same vault, but I did not know I could fix it.


Org-roam and git!


> 1.1221 Sometimes the primary tool is not available. An “always-on” secondary should take its place.

> 1.1222 Sometimes log from different tools or locations need to be “merged”, “persisted”.

These two are the main reasons I've stuck with Google Docs: It's available everywhere, and everything's always in-sync. Google already has all my info of value, so the incremental trust necessary to the threshold of log visibility is minimal.

I'll add one more requirement: I don't want to leave an unencrypted on-disk footprint containing my notes. This means I can access the entirety of my notes on any machine and the only thing I have to worry about is keylogging/screenscraping.

Google Docs is lacking in many respects, though. Linkability is nonexistent. Docs longer than 100 pages really struggle with latency on mobile. Searchability is bad (!) because you have to open the doc first to see the matches from within.

Really wish there was a self-hosted alternative with sync and encrypted storage that didn't result in sync errors. I've tried DEVONthink, Obsidian, LogSeq, Google Keep, Notion, NotesNook, GoodNotes, Samsung Notes, Loop, OneNote, Apple Notes, Org Mode, plain text files, and probably a dozen others... I'd say NotesNook is the best so far, with DEVONthink a close second, but, nothing beats the reliability, omnipresence and privacy of Google Docs.


Syncthing + org-mode is the winning combo for me


With org-roam and roam-ui you have a perfect visualization. It's a bit lacking on mobile, been trying logseq on and off but it's so buggy.


What about using Visual Studio Code with your notes in multiple Markdown files with automatic syncing to Google Drive via the desktop Google Drive app?

Searching multiple files, editing and organising files is quick and simple on desktop, and you can fall back to Google Docs on mobile when needed (the Google Drive app lets you view .md files, but there's no way to edit?). You can install extensions as well (like for doing inline maths) and if you already use Visual Studio Code for coding it's one less thing to learn.

I'm not familiar with Obsidian, which gets mentioned a lot. What does it improve on compared to the above if you don't need a mobile app or complex linking between Markdown files?


+1 for Obsidian or other platforms that support Markdown format natively. Being platform locked is terrible for knowledge management and Markdown has made me less concerned with future knowledge access in a post-Obsidian world. (Yes, I said it! There will be a post-Obsidian world.)


A couple years ago I switched from Google Docs to Obsidian.

Unfortunately, as others have mentioned, Google Docs omits some seriously impactful features.

All it would've taken, at the time, was collapsible/foldable headings for me to stick with Docs.

But since then, I've grown to appreciate the million other things Obsidian has to offer, like the ease of developing plugins, which, to me, makes Obsidian feel like it's an OS within my OS.


I'm using it and syncing my notes to GitHub, these are technical notes


If you don't need to sync with mobile, then adding your `notes` directory to IDE projects is a great solution. I've been doing that for a long time.

Eventually I switched to Obsidian for mobile support (syncing with free 'Remotely Save' plugin using S3). There are 2 other features of Obsidian that I came to appreciate over time:

1. Daily Notes

2. Calendar via Full Calendar plugin that uses your daily notes as one of event sources. So I can freely mix my Google Calendar 'official' meetings with my personal timeslot allocations for current day.

It is possible to have both of these features in an IDE too, but with Obsidian they come for free.


> It's available everywhere

Unless you are offline


Or you somehow trigger their anti-spam system on service Y while you really care about service X, but they're all bundled under the same company Z, so being blocked on one service impacts your usage of all other services under the same company...


I think apps generally work offline which satisfies the OPs requirement of eventually consistent merging.


Joplin is great, try it. I self-host mine; it syncs between devices and encrypts notes.


I did! The notes are stored in a local DB (SQLite, IIRC) that was not encrypted on disk while the app was running.


What didn’t work for you with OneNote? It supports AES encryption and inter-page linking.


OneNote sync conflicts effectively destroyed a document for me, with changes splintered across both copies. All my Inking across the copies had unequal offsets making merging impossible. I had to transcribe everything to text manually and trash the document. This was back when the "new" OneNote was still "new".

The infinite document length paradigm also made it impossible to print and review notes; text would slice mid-height across pages.

I know people who swear by the desktop version of OneNote, though...


try acreom, free e2ee sync that works out of the box with a local-first setup.


Anytime a knowledge management system comes into my head I remember this joke:

https://www.reddit.com/media?url=https%3A%2F%2Fi.redd.it%2F2...


Well, if you s/apple\ notes/vi/g ...


There are some good ideas in here.

Personally I’ve abandoned all of my complex attempts at productivity and knowledge systems and have recently moved to the ever reliable txt file. You can format it however you want, use fun ascii art, or just plain paragraphs.

I’ve got a daily.txt where I prepend a new day's section every day and put in whatever I want over the course of the day. I also create other files for courses I’m going through or projects as needed. The nice thing is that, using my editor, I can make a file reference and hit gf and it’ll go right to the referenced file. But this “system” will keep on working regardless of what tools or system I happen to be using.
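A minimal sketch of the "prepend a new section each day" step, for anyone who wants to script it. The `== YYYY-MM-DD ==` header style and the function name are assumptions for illustration, not the commenter's actual format.

```python
from datetime import date


def prepend_day_section(text: str, today: date) -> str:
    """Return text with a header for `today` at the top, unless it is already there."""
    header = f"== {today.isoformat()} =="  # header style is an arbitrary choice
    if text.startswith(header):
        return text  # today's section already exists, leave the file alone
    return f"{header}\n\n{text}"


log = "== 2024-01-13 ==\nread the lakehouse article\n"
log = prepend_day_section(log, date(2024, 1, 14))
print(log.splitlines()[0])  # == 2024-01-14 ==
```

Working on the file's text rather than the file itself keeps the function easy to test; reading and writing daily.txt around it is a two-line wrapper.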


It's clearly a nice and simple way to write data, but what about the other way around? How do you manage to find information when you need it?


I use Apple Notes and don’t bother to organize anything unless it’s a whole bunch of related material that I’m going to use together—notes for an rpg session, say. All the rest, I just make sure that I give it a title that I’ll recognize when I see it, and include words that I’ll likely use if I go looking for it with search.

Been doing this for years, haven’t lost anything yet. It works so well that time organizing it would have been wasted.


Over the years I realized that the best tool for that is the one that’s easy to do it both on my phone and computer. That’s why I’ve stuck with Apple Notes for more temporary stuff or unorganized stuff and Notion for more structured concepts.

Idk how long these tools will exist but at least Notion allows exporting to markdown or PDF.


Having trouble with Apple notes syncing too slowly. Wife can update the shopping list but I won’t see it for 24+ hours. Unfortunately my life runs on it. Problem may be that my Notes file has become a 15G monstrosity?


“Reminders” is better for things like shared shopping lists (not to say Notes shouldn’t be working better for you for this use case, but Reminders is another option that’s specifically for that kind of use case)

[edit] in the latest version it’ll even organize your grocery lists by category, for easier shopping.


Have you tried using Notion?

For things that are mostly lists I prefer using Microsoft To Do though (I have used it ever since it was called Wunderlist)


Like it but I use all the features of Apple Notes. Notion has only a few of them. Thanks though


I use reminders for shopping lists. They can also be shared. Hopefully they are not the same database as notes!


I've been doing exactly the same thing. One risk is data safety: years of logs attached to an iCloud account is concerning.


There exist programs to export them to markdown or what have you. Dunno how well they handle embedded media. I do a lot of copy-pasting screenshots or embedding PDF pages… or entire pdfs.

[edit] “why screenshots?”

1) To record gui workflows, walkthrough-style.

2) to record whole screens of values from guis while preserving formatting perfectly (think: cloud dashboard vital stats screens for various resources)

3) to record short message exchanges from ephemeral messaging with all the formatting intact with zero extra effort. (Think: feature discussion in a periodically-cleaned chat channel; I can always turn it into text later if I need to, recording with screenshot is fast)

4) plus now that it’s almost as easy and reliable to copy-paste from images as from regular text, on macOS and iOS, why not?


Read data? lol it is usually just a write-only system (aside from reviewing the current info from the week).

But since it is all in one place I just do search queries using my editor to find what I need. I don’t use tags, but it would be very simple to just append some #tags in my content and search for those.


Although I extensively use org mode, I didn't follow the GP's approach precisely because of this searchability problem. These days, however, vector DBs and doing RAG is getting close to changing my mind. I want to do less and less organizing, and more dumping, and use vector search capabilities to find what I need.

Haven't set it up yet, but it's on my TODO list.


CTRL-F or grep seem like they would work well


I have recently started to play around with the lightweight Emacs mode, "HOWM" which was written about 20 years ago by a Japanese author, who still maintains it today.

The howm-mode shows you a summary view, comprised of information (just the title/heading) from the (configurable) last 20 notes you have made. Everything goes under ~/howm/YYYY/MM e.g. ~/howm/2024/01 for notes I am making this month.

It has a simple way to enter schedule, todo, deadline, etc. which it shows just below the header that has single-key commands (s for case-insensitive searching, c to create a new note, etc.) for searching, navigation, new notes.

The genius of howm is that you can write and have very simple back and forward links to any file, URL, or tag, any kind of text etc., and you write "fragments" that is, whatever the smallest unit of text you want - it just needs a title plus anything else you want to add to it.

When you save the file, you hit Ctrl-C plus ",," and are back in the summary view.

So you can have very small sized units/fragments, which are very fast to create and save, but due to the forward and back links and the ease of searching (which will collate in a temporary list, everything found), you can "create fragmentarily but view/search collectively" as the documentation puts it.

My next step is to author HOWM notes in AsciiDoc or Markdown format which can easily be fed (since each note is a separate text file) into a converter for fancier output.

Syncthing easily keeps things in sync between my desktop and laptop.

In the very best of FOSS tradition, someone else wrote up a full PDF manual, in English. https://github.com/Emacs101/howm-manual/blob/main/Howm_tutor...


This year I'm trying a new mechanism based around GitHub Issues (which I've been using for personal knowledge management for a few years already) - I create a new "planner" issue every day, then use that to make notes about what I want to do and what I've got done.

Hitting Command+N in Firefox on my Mac opens up my planner issue for the day, or creates it if one doesn't exist yet. I wrote up how that system works (a tiny bit of GitHub Actions + Pages magic) here: https://til.simonwillison.net/github-actions/daily-planner


I had a revelation that my ideas are worthless and the notes I took are also going to be worthless once I get familiar with X.

So I completely stopped taking notes. A to-do list plus a Google Drive for my son and our stuff is good enough. The only problem is that we are not comfortable putting sensitive information on Google Drive, so we will probably build a NAS for that.


My experience has been the opposite. I used to envy people with amazing memories like it was a super power. Then I realized if I got into the habit of just writing stuff down, I'd effectively have that same superpower. It's the one thing that I can honestly say has changed my life.

I use obsidian.md for notes, which I have on phone and desktop. Nothing fancy. You just need something that's always with you, has great search, and ideally allows you to link notes together.


It’s not only about keeping memories but structuring them. Even journaling (about your thoughts and feelings) helps with making sense of what’s on your mind: https://youtu.be/FNJO1pZV-I8?si=Efqm5uAOSp9MSeOq


I don't really have an amazing memory. I just found out that most of my stuff is worthless and not worth preserving.


That can work in a fast-paced environment, but if you work on projects or ideas with hard numbers for years, it's either not applicable or your brain capacity is off the charts.


Yeah 100%, I forgot to mention the above is just for my private projects, so a lot of notes went in as comments.

For work we do use Confluence, but again the quality of our notes is not great. From my experience the most important notes are the ones that tag a PoC if issue X flares up.


My notes are either:

* Some information that I don't trust myself to remember

* A canvas to refine my own ideas

* Summary of others' works.

Now it's easier for me to connect dots across works, themes and disciplines. That works wonders for compression of knowledge inside my brain.


I use notes because my memory is only so good, and they can be useful to me to refer back to. I just use Apple Notes. Good enough for me. That said, I don't think any of my ideas are somehow worthy of being "archived" or the like for others. They are simply helpful to me.

I don't know when ideas without execution became thought of as so valuable. Ideas are a dime a dozen; the hard part is testing those ideas to see if they actually work.


I just create one markdown file per day. I wrote a custom obsidian plugin that displays the "timeline" as if it was one open file and creates the files in year/month/week subfolders.

But I don't use linking and all the other complexity. But if I ever needed it I could.

Then it's synced with Nextcloud and also available on my phone.

I like this the most because just as explained, the time line is the most accurate representation of reality. And I can very quickly scroll back to see what I did last week or do full text search.

Simple and useful!
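The year/month/week folder layout could look something like the sketch below. The exact path format (`W02`, ISO week numbers, `.md` names) is an assumption for illustration; the plugin described above is unpublished, so this is not its actual scheme.

```python
from datetime import date
from pathlib import PurePosixPath


def daily_note_path(day: date) -> PurePosixPath:
    """Map a date to year/month/week/day.md (one possible layout, assumed here)."""
    iso_week = day.isocalendar()[1]  # ISO 8601 week number, 1..53
    return PurePosixPath(
        f"{day:%Y}", f"{day:%m}", f"W{iso_week:02d}", f"{day.isoformat()}.md"
    )


print(daily_note_path(date(2024, 1, 14)))  # 2024/01/W02/2024-01-14.md
```

One caveat of this choice: ISO weeks can straddle a year boundary (2024-12-30 falls in ISO week 1 of 2025), so the week folder under December may be named `W01`.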


I love the approach. Have you published the "timeline" plugin or is code available?


I haven't yet because it's hacky at the moment and I want to clean it up first (once I have some spare time). But there are somewhat similar plugins already published, e.g.: https://github.com/Quorafind/Obsidian-Daily-Notes-Editor


Can you explain the timeline further? I don’t understand- just a list of links?


The idea is to create one markdown file per day but to be able to open them all in one editor view. Then one can scroll through all the days conveniently like a timeline. Something like this [0] plugin although I did create my own (more hacky) version for myself.

[0]: https://github.com/Quorafind/Obsidian-Daily-Notes-Editor


Reads a lot like logseq.


That was actually the inspiration for the journal view but I like obsidian more. One thing I do miss from logseq is also its PDF annotation feature. Although that also comes with quite a bit of complexity.


Why Nextcloud for sync?


Because I already use it for everything else such as contacts (CardDAV), calendar events and tasks (CalDAV) which syncs nicely with my android phone. It's not that hard to host if not many people use the instance, has always worked well for me.


This is the second piece of content I've ever seen using the same numbering system as Wittgenstein's Tractatus, the first of course being the Tractatus itself.


Can't wait for the followup, Knowledgebase Investigations in which he realizes that he has derived no use from his knowledgebase. Water!


Isn't this what Hypercore[0] is? But distributed? It's an append-only log, and then there are higher-order abstractions built on top of this log. Hyperbee is a key-value store that uses an append-only log, Hyperdrive is a filesystem, etc...

[0]: https://github.com/holepunchto/hypercore
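To make the "higher-order abstraction on top of an append-only log" idea concrete, here is a toy, in-memory sketch. The class and method names are invented for illustration; this is not Hypercore's API, and it ignores distribution, signing and persistence entirely.

```python
from __future__ import annotations


class AppendOnlyLog:
    """A list that only grows; entries are never edited or removed."""

    def __init__(self) -> None:
        self._entries: list[tuple[str, str | None]] = []

    def append(self, key: str, value: str | None) -> int:
        self._entries.append((key, value))
        return len(self._entries) - 1  # sequence number of the new entry

    def __iter__(self):
        return iter(self._entries)


class LogBackedKV:
    """Key-value view derived by replaying the log; the latest entry per key wins."""

    def __init__(self, log: AppendOnlyLog) -> None:
        self._log = log

    def put(self, key: str, value: str) -> None:
        self._log.append(key, value)

    def delete(self, key: str) -> None:
        self._log.append(key, None)  # a tombstone; older entries stay in the log

    def get(self, key: str) -> str | None:
        result = None
        for k, v in self._log:  # full replay: simple, not efficient
            if k == key:
                result = v
        return result


kv = LogBackedKV(AppendOnlyLog())
kv.put("editor", "vim")
kv.put("editor", "emacs")
print(kv.get("editor"))  # emacs
```

The point of the design is that the log is the source of truth and the key-value store is only a derived view, so history is never lost when a value is overwritten.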


My approximate journey (7,500 notes, 13% markdown, remainder .txt)

(pc, circa 2010) notepad > Evernote > notepad > (mac) > vim > Notational Velocity > NVUltra (still in) beta > "The Archive" (Zettelkasten) > (n)vim

- notes synced with Dropbox folder (could just as well use iCloud) - with vim fzf/telescope plugins for search

What finally flipped me back to vim was using a setup script to give me a useful set of plugins that did nice markdown highlighting out of the box, and then working out how to do things like auto-indenting, opening URLs in browser, renaming files more easily etc.

Slight annoyance: yet to find a way to search both filenames and file contents at the same time

Do systems that link from one note to another help me? Apparently not much. Quite a lot of effort to maintain the links, better to express hierarchy in filenames.

(still using Omnifocus for todo lists, but it can import/export Taskpaper text format)
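On the "search both filenames and file contents at the same time" annoyance, one workaround outside the editor is a small script. This is only a sketch under assumptions: plain case-insensitive substring matching, `.txt` and `.md` files only, and made-up function names; it does not integrate with fzf or telescope.

```python
from __future__ import annotations

from pathlib import Path


def match_note(name: str, text: str, query: str) -> list[str]:
    """Return labelled hits for one note: the filename first, then matching lines."""
    needle = query.lower()
    hits = []
    if needle in name.lower():
        hits.append(f"{name}: (filename)")
    for lineno, line in enumerate(text.splitlines(), start=1):
        if needle in line.lower():
            hits.append(f"{name}:{lineno}: {line.strip()}")
    return hits


def search_notes(root: Path, query: str) -> list[str]:
    """Search every .txt and .md file under root by filename and by content."""
    hits = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in {".txt", ".md"}:
            text = path.read_text(encoding="utf-8", errors="replace")
            hits.extend(match_note(path.name, text, query))
    return hits


print(match_note("vim-setup.md", "Install fzf\nthen configure Vim\n", "vim"))
# ['vim-setup.md: (filename)', 'vim-setup.md:2: then configure Vim']
```

Calling `search_notes(Path.home() / "notes", "vim")` would run the same matching over a whole folder; the folder name there is a placeholder.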


I like the idea, and I do something similar but this has a lot of rules and feels more complex than potentially necessary. I personally use Obsidian.md as the tool for my Zettelkasten method. It provides back links and the like. I also create engineering journals for projects.

The only rule I have is to avoid unlinked notes.


Several comments about personal note taking systems. Here's mine. :)

    function note { mg +-1 "/home/user/notes/$(date "+%Y-%m-%d").txt"; }
"mg" is "Micro Emacs clone" and "+-1" opens the file at the bottom for appending, but of course it works equally fine with your choice of $EDITOR.

I like this setup because it's plaintext (or markdown if you prefer), easily sorts on filename, and easily greppable. I have a couple years of notes saved this way and it's fun to explore old notes.


I thought Knowledge Lakehouse was going to be a variation of a Memory Palace


This post probably has some kernels of value, but it suffers from the XY problem: it's a note taking system which provides the solution, but not the problem it's solving. Let's invert that and focus on the problem first. Below are the problems my personal system solves, followed by the solutions. My solutions may not fit your situation, but if you see yourself in the problems, maybe you'll be inspired to find the solution that works for you.

1. Formatting easily becomes a distraction, but a little formatting is vital, so I use markdown to write my plans and notes.

2. Mixing streams from different projects is confusing, so each project has its own workspace - either a folder or a distinct prefix on the file name.

3. Work or ideas often become irrelevant for a while as plans change, but I hate feeling like I might lose work that might be valuable later. So I have a "dump" file or folder where I can dump such things. Entries are typically dated so they can be referenced by other files (see below). I typically use a level 1 or 2 markdown section header to delimit entries.

4. I often have a "log" markdown file where I append descriptions of indisputably important project developments (as opposed to maybe-important-later stuff that goes in the dump file). Entries here are dated and delimited similarly to the dump file.

5. There is a "plan" file that I try to keep clean and concise. This is where I go to remember what I have resolved to do next.

6. A common issue with plan files is that they get bloated with discussion of current or past context, as you try to figure out what to do and how to do it. When this happens, the discussion is moved to the "log" or "dump" files and replaced with a "link" describing how to find it again (e.g. "see log date 2024-01-12 for more on this").

7. There is a master plan file that coordinates all the projects. Bloat in this file is moved down to individual project files and replaced with a link.

To recap: markdown; project workspaces; log and dump with dated entries; concise plan file; move bloat out of plan into log or dump and replace with a link; master plan file with similar bloat management strategy.
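The "dated entries plus a link left behind in the plan" mechanic from points 3, 4 and 6 can be sketched in a few lines. The function names and the sample entry text are hypothetical; only the level 2 header delimiter and the "see log date ..." wording come from the description above.

```python
from datetime import date


def append_entry(log_text: str, day: date, body: str) -> str:
    """Append a dated entry, delimited by a level 2 markdown header."""
    entry = f"## {day.isoformat()}\n\n{body.strip()}\n"
    if not log_text:
        return entry
    return f"{log_text.rstrip()}\n\n{entry}"


def link_stub(target: str, day: date, topic: str) -> str:
    """The one-line pointer left behind in the plan file."""
    return f"(see {target} date {day.isoformat()} for more on {topic})"


log = append_entry("", date(2024, 1, 12), "Compared two pipeline designs; chose the simpler one.")
print(link_stub("log", date(2024, 1, 12), "pipeline design"))
# (see log date 2024-01-12 for more on pipeline design)
```

Because the date is the only key, the "link" stays valid in any editor: searching the log for `## 2024-01-12` finds the entry.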

I have used this system for five years. It has developed over time, but all elements have been in use for at least two years.

I am a staff data scientist leading a team of a half dozen people that is overhauling the core analytics pipelines and models at a Fortune 100, and 80% of my day is IC work. I have this job not because anyone asked me to do it, but because I proved I could do it on my own time. I work normal hours, parent of two kids, working spouse. I don't have time to putter around. This system is a big part of how I make it work.


this is effectively a very geeked out zettelkasten method.

https://en.wikipedia.org/wiki/Zettelkasten


An interesting read and a good share.

Every time I see that other people are contemplating these subjects, I notice that they hit on many of my own considerations and miss many of the big picture things I consider essential.

I think this will evolve into tribalism of mental models for organizing information. More vi vs emacs energy to go around in the decades to come.


Whenever I try one of these apps I'm always impressed by the features but my entire life is in iCloud.

Not only that but all the people I connect with for notes/calendars/reminders are also in the Apple ecosystem and they aren't going to switch for me. This means that a lot of the extra features that Notion, Obsidian, etc. offer just aren't valuable for me.

It makes me annoyed that Apple's Notes app is so basic.


Maybe NotePlan[0] can sync with iCloud?

[0]: https://noteplan.co/


You can use iCloud to sync your Obsidian notes across your devices.

I find the app experience to be terrible tho.


Very cool ideas! Very simple concept, indeed!

It helped to pin down my own approach, which is a very similar workflow, with these differences:

- The log entries are flashcards with ID

- The structure / tree follows the scientific method, aka the glue is not ad-hoc.

- So far I need no lists, because I only learn things and I'm not tracking any activity


Thanks for posting this. I'm currently building something that consolidates events from numerous sources and then automatically processes them.

This post helped to give me some additional ideas on the processing of those events.


Interesting, and something I’ve been thinking about. What is your architecture? Using any available workflow products for “retrieve/receive event from X, figure out what do, then do that in Y”?


Right now, since it’s currently being built out for a single user, I’m just using NestJS for the backend, TypeScript and Vite for the browser extension, Python for the daemon, Postgres for the DB and ELK for logging and embeddings storage.

I have a few sources built out right now and some fun utilities thanks to the browser extension, but in the end, I’m expecting to have 15-20 different data sources that are sent by the browser extension, collected by the daemon, and polled via crons in NestJS.

I’m taking ideas I gleaned from the AI-Town paper about taking “memories” and then generating “reflections” and then doing some other stuff on top of that and also building out a knowledge graph of everything I do, read, listen to, etc.

I’m currently figuring out the standardization of the disparate events into memories to then transform into reflections and knowledge graphs.

I’d love to hear more about what you’ve been thinking of and the issues you’ve had.

Edit: One cool thing you can do with the browser extension for this stuff is to send your cookies and headers to the database and then use in the cron jobs to get the data—think pandora.


I’m using capacities.io; it is much more accessible than Obsidian.


I hope more people share content like this in HN. Note taking is incredibly personal and as a result, I think the best ideas might be lying dormant and unshared.


I enjoyed this article, though not sure I can be bothered printing books.

The approach taken here is essentially event sourcing, btw. It's old and it works.


What do all these have to do with lakehouse? (and that is a lame term to begin with anyway)


I use a similar kind of structure in Developer Diary. I do it completely offline, no sync.


Anytype all day every day


I was very excited about this when it launched (in alpha). How has the product matured? Also, does it now have an API for easy import/export of data into the database?



