You can try the models here:

(available without sign-in) FLUX.1 [schnell] (Apache 2.0, open weights, step distilled): https://fal.ai/models/fal-ai/flux/schnell

(requires sign-in) FLUX.1 [dev] (non-commercial, open weights, guidance distilled): https://fal.ai/models/fal-ai/flux/dev

FLUX.1 [pro] (closed source [only available thru APIs], SOTA, raw): https://fal.ai/models/fal-ai/flux-pro
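
If you'd rather call them from code than the playground, something like this should work (a minimal sketch, assuming the fal-client Python package and a FAL_KEY env var for auth; the exact response shape is from memory, so double-check the endpoint docs):

    # minimal sketch -- assumes `pip install fal-client` and FAL_KEY in the env
    import fal_client

    result = fal_client.subscribe(
        "fal-ai/flux/schnell",  # swap for fal-ai/flux/dev or fal-ai/flux-pro
        arguments={"prompt": "a forest cabin at dusk, warm lamp light"},
    )
    print(result["images"][0]["url"])  # assumed response shape: a list of image URLs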


> (available without sign-in) FLUX.1 [schnell] (Apache 2.0, open weights, step distilled): https://fal.ai/models/fal-ai/flux/schnell

Well, I was wondering about bias in the model, so I entered "a president" as the prompt. Looks like it has a bias alright, but it's even more specific than I expected...


You weren’t kidding. Tried three times and all three were variations of the same[0].

[0] https://fal.media/files/elephant/gu3ZQ46_53BUV6lptexEh.png


The interesting part is that it does a good Joe Biden, but Trump always looks weird and alien.

https://imgur.com/a/fgf6Jt3


What is the difference between schnell and dev? Just the kind of distillation?


>FLUX.1 [dev]: The base model

>FLUX.1 [schnell]: A distilled version of the base model that operates up to 10 times faster

It should also be noted that "schnell" is the German word for "fast".


Not quite right, per their github repo:

> Models

> We are offering three models:

> FLUX.1 [pro] the base model, available via API

> FLUX.1 [dev] guidance-distilled variant

> FLUX.1 [schnell] guidance and step-distilled variant


Schnell is definitely worse in quality, although still impressive (it gets text right). Dev is the really good one that arguably outperforms the new Midjourney 6.1


Also worth mentioning schnell is a 4-step model, so comparable to SD lightning in that regard


What's the difference between pro and dev? Is the pro one also 12B parameters? Are the example images on the site (the patagonia guy, lego and the beach potato) generated with dev or pro?


I think they [the example images] are mainly -dev and -schnell outputs. Both models are 12B. -pro is the most powerful and raw, -dev is a guidance-distilled version of it, and -schnell is a step-distilled version (where you can get pretty good results with 2-8 steps).
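
to make "guidance distilled" concrete, here is a toy sketch (illustrative only, nothing like the real FLUX training recipe): the raw model needs two forward passes per step for classifier-free guidance, and the student is trained to reproduce the guided output in a single pass, which is why it is cheaper but has the guidance behavior baked in.

    # toy sketch of guidance distillation -- not the actual FLUX recipe
    import torch

    teacher = torch.nn.Linear(20, 16)  # stand-in denoiser: latent(16) + cond(4) -> noise(16)
    student = torch.nn.Linear(20, 16)  # learns the *guided* teacher output in one call
    opt = torch.optim.Adam(student.parameters(), lr=1e-3)
    scale = 3.5                        # guidance scale that gets baked into the student

    for _ in range(100):
        latent = torch.randn(8, 16)
        cond = torch.randn(8, 4)       # toy text conditioning
        uncond = torch.zeros(8, 4)     # "empty prompt"
        with torch.no_grad():
            eps_c = teacher(torch.cat([latent, cond], dim=1))    # conditional pass
            eps_u = teacher(torch.cat([latent, uncond], dim=1))  # unconditional pass
            target = eps_u + scale * (eps_c - eps_u)             # classifier-free guidance
        pred = student(torch.cat([latent, cond], dim=1))         # one pass, no CFG needed
        loss = torch.nn.functional.mse_loss(pred, target)
        opt.zero_grad(); loss.backward(); opt.step()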


what does guidance distilled mean?

something about pro must be better than dev or it wouldn't be made API-only, but what exactly? how does guidance distillation affect it, and what quality remains in dev?


Requires sign-in with a GitHub account, unfortunately.


I think they may have turned on the gating some time after this was submitted to HackerNews. Earlier this morning I definitely ran the model several times without signing in at all (not via GitHub, not via anything). But now it says "Sign in to run".


i just updated the links to clarify which models require sign-in and which don't!


Why is it called Black Forest Labs? Are you in the Schwarzwald? Your names sound German, but I see no imprint.


Try the ToS, scroll to the end.


This looks like the one that got leaked a couple weeks ago, so I guess they decided it's better to open source it at this point, after the leak [0].

[0]: https://x.com/cto_junior/status/1794632281593893326


It is. The model.ckpt from petra-hi-small matches the official HF repo.

SHA256: 6049ae92ec8362804cb4cb8a2845be93071439da2daff9997c285f8119d7ea40
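
For anyone who wants to check their own download, hashing a multi-GB checkpoint in chunks (a minimal sketch; the filename is whatever you have locally):

    import hashlib

    def sha256_of(path, chunk=1 << 20):
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for block in iter(lambda: f.read(chunk), b""):
                h.update(block)  # stream in 1 MiB chunks, never load the whole file
        return h.hexdigest()

    print(sha256_of("model.ckpt"))  # compare against the hash above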


it was already planned for open-sourcing, the leak did not affect the plans in any way


Just to make it clear, the actual Stable Diffusion 3 (the 8B variant, the one they are exclusively serving as an API) that they announced in March is still closed source, and there are no indications of when or if it will be released.


~38 TOPS at fp16 is amazing, if the quoted number is fp16 (the ANE is fp16 according to this [1], but that honestly seems like a bad choice when people are going smaller and smaller even on the high-end datacenter cards, so not sure why Apple would use it instead of fp8 natively)

[1]: https://github.com/hollance/neural-engine/blob/master/docs/1...


For reference, the llama.cpp people are not going smaller: most of those models run on 32-bit floats, with the dequantization happening on the fly.
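
Roughly what that looks like, in the spirit of ggml's 4-bit block formats (a toy sketch, not the exact Q4_0 layout): weights are stored as a per-block scale plus small integers, and get expanded back to f32 at matmul time.

    import numpy as np

    # toy 4-bit block quantization, in the spirit of ggml Q4 formats (not exact)
    def quantize_block(w):                      # w: one block of 32 float weights
        scale = np.abs(w).max() / 7 or 1.0      # per-block scale (guard all-zero block)
        q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
        return scale, q                         # ~4 bits per weight + one scale

    def dequantize_block(scale, q):             # done on the fly, back in f32
        return scale * q.astype(np.float32)

    w = np.random.randn(32).astype(np.float32)
    scale, q = quantize_block(w)
    print(np.abs(w - dequantize_block(scale, q)).max())  # small reconstruction error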


Backlight is now the main bottleneck for consumption-heavy uses. I wonder what the main advancements are that are happening there to optimize the wattage.


If the use cases involve working on dark terminals all day, or watching movies with dark scenes, or if the general theme is dark, maybe the new OLED display will help reduce the display power consumption too.


AMD GPUs have "Adaptive Backlight Management", which reduces your screen's backlight but then tweaks the colors to compensate. For example, my laptop's backlight is set at 33%, but with ABM it reduces my backlight to 8%. Personally I don't even notice it is on; my screen seems just as bright as before. But when I first enabled it I did notice some slight difference in colors, so it's probably not suitable for designers/artists. I'd 100% recommend it for coders though.
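
The rough idea behind the compensation (toy math, not AMD's actual algorithm): boost pixel values in linear light by the same factor the backlight dropped, then clip. The clipping at the top end is exactly why bright colors look slightly off:

    # toy model of backlight dimming + pixel compensation (not AMD's actual ABM)
    old_backlight, new_backlight = 0.33, 0.08
    gamma = 2.2

    def compensate(value):                                    # value: 0-255 sRGB channel
        linear = (value / 255) ** gamma                       # to linear light
        boosted = linear * (old_backlight / new_backlight)    # undo the dimmer backlight
        return round(min(boosted, 1.0) ** (1 / gamma) * 255)  # clip, back to gamma

    print(compensate(40))   # 76: dark pixels can be fully compensated
    print(compensate(200))  # 255: bright pixels clip -> the slight color shift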


Strangely, Apple seems to be doing the opposite for some reason (color accuracy?): dimming the display doesn't seem to reduce the backlight as much, and they appear to be using a combination of backlight reduction and software dimming, even at "max" brightness.

Evidence can be seen when opening up iOS apps, which seem to glitch out and reveal the brighter backlight [1]. Notice how #FFFFFF white isn't the same brightness as the white in the iOS app.

[1] https://imgur.com/a/cPqKivI


The max brightness of the desktop is gonna be lower than the actual max brightness of the panel, because the panel needs to support HDR content. That brightness would be too much for most cases


This was a photo of my MBA 15" which doesn't have an HDR capable screen afaik. Additionally, this artifacting happens at all brightness levels, including the lowest.

It also just doesn't seem ideal that some apps (iOS) appear much brighter than the rest of the system. HDR support in macOS is a complete mess, although I'm not sure if Windows is any better.


Please give me an external ePaper display so I can just use Spacemacs in a well-lit room!


Onyx makes an HDMI 25" eInk display [0]. It's pricey.

[0] https://onyxboox.com/boox_mirapro

edit: "25, not "27


I'm still waiting for the technology to advance. People can't reasonably spend $1500 on the world's shittiest computer monitor, even if it is on sale.


Dang, yeah, this is the opposite of what I had in mind

I was thinking, like, a couple hundred dollar Kindle the size of a big iPad I can plug into a laptop for text-editing out and about. Hell, for my purposes I'd love an integrated keyboard.

Basically a second, super-lightweight laptop form-factor I can just plug into my chonky Macbook Pro and set on top of it in high-light environments when all I need to do is edit text.

Honestly not a compelling business case now that I write it out, but I just wanna code under a tree lol


I think we're getting pretty close to this. The Remarkable 2 tablet is $300, but can't take video input and software support for non-notetaking is near non-existent. There's even a keyboard available. Boox and Hisense are also making e-ink tablets/phones for reasonable prices.


A friend bought it & I had a chance to see it in action.

It is nice for some very specific use cases. (They're in the publishing/typesetting business. It's… idk, really depends on your usage patterns.)

Other than that, yeah, the technology just isn't there yet.


If that existed as a drop-in screen replacement on the Framework laptop, with a high-refresh-rate color Gallery 3 panel, then I'd buy it at that price point in a heartbeat.

I can't replace my desktop monitor with eink because I occasionally play video games. I can't use a 2nd monitor because I live in a small apartment.

I can't replace my laptop screen with greyscale because I need syntax highlighting for programming.


Maybe the $100 nano-texture screen will give you the visibility you want. Not the low power of an epaper screen, though.

Hmm, emacs on an epaper screen might be great if it had all the display update optimization and "slow modem mode" that Emacs had back in the TECO days. (The SUPDUP network protocol even implemented that at the client end and interacted with Emacs directly!)


QD-OLED reduces it by like 25%, I think? But maybe that will never be in laptops, I'm not sure.


QD-OLED is an engineering improvement, i.e. combining existing researched technology to improve the resulting product. I wasn't able to find a good source on what exactly it improves in efficiency, but it's not a fundamental improvement in OLED electrical→optical energy conversion (if my understanding is correct.)

In general, OLED screens seem to have an efficiency of around 20-30%. Some research departments seem to be trying to bump that up [https://www.nature.com/articles/s41467-018-05671-x], which I'd be more hopeful about…

…but, honestly, at some point you just hit the limits of physics. It seems internal scattering is already a major problem; maybe someone can invent pixel-sized microlasers and that'd help? More than 50-60% seems like a pipe dream at this point…

…unless we can change to a technology that fundamentally doesn't emit light, i.e. e-paper and the likes. Or just LCD displays without a backlight, using ambient light instead.
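
For a sense of scale, a quick back-of-envelope (every number here is a rough assumption): at ~25% efficiency, a full-white 15" panel at 400 nits costs on the order of a watt of electrical power just for the light itself, so doubling efficiency saves real battery, but the ceiling is close.

    import math

    # back-of-envelope; all numbers are rough assumptions
    area = 0.07        # m^2, roughly a 15" laptop panel
    luminance = 400    # cd/m^2 (nits), full-white screen
    efficacy = 300     # lumens per optical watt, ballpark for display white
    efficiency = 0.25  # electrical -> optical, per the ~20-30% figure above

    flux = math.pi * luminance * area  # lumens, assuming Lambertian emission
    optical_w = flux / efficacy
    electrical_w = optical_w / efficiency
    print(f"{flux:.0f} lm -> {optical_w:.2f} W optical -> {electrical_w:.1f} W electrical")
    # ~88 lm -> ~0.29 W optical -> ~1.2 W electrical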


Is the iPad Pro not yet on OLED? All of Samsung's flagship tablets have had OLED screens for well over a decade now. It eliminates the need for backlighting, has superior contrast, and is pleasant to use in low-light conditions.


The iPad that came out today finally made the switch. iPhones made the switch around 2016. It does seem odd how long it took for the iPad to switch, but Samsung definitely switched too early: my Galaxy Tab 2 suffered from screen burn-in that I was never able to recover from.


LineageOS has an elegant solution for OLED burn-in: imperceptibly shift persistent UI elements by a few pixels over time.
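
Something like this, conceptually (a toy sketch, not the actual LineageOS code): cycle persistent elements through a small ring of offsets every few minutes, so no subpixel holds a static value forever.

    import itertools, time

    # toy sketch of OLED pixel-shift, not the actual LineageOS implementation
    OFFSETS = [(0, 0), (1, 0), (1, 1), (0, 1), (-1, 1),
               (-1, 0), (-1, -1), (0, -1), (1, -1)]

    def pixel_shift(draw_ui, period_s=300):
        # redraw status bar / nav bar at a slowly cycling, sub-perceptual offset
        for dx, dy in itertools.cycle(OFFSETS):
            draw_ui(dx, dy)
            time.sleep(period_s)

    # usage: pixel_shift(lambda dx, dy: print(f"UI at offset ({dx}, {dy})"), period_s=1)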


I'm not sure how OLED and backlit LCD compare power-wise exactly, but OLED screens still need to put out a lot of light; they just do it directly instead of with a backlight.


we considered which one adheres to the prompt more, which one has the best overall aesthetics, etc., but ended up with a simple "which one is overall better" type question. it is easier for people to vote and decide on one, and it is still applicable as preference data at a larger scope (trading volume for simplicity).

the dataset is open source and we plan to train an aesthetics picker on it, but we obviously have to do proper evals (with at least 1M data points) to come to a reasonable conclusion.


As feedback, I can tell you it's not easy at all for me to decide which one is better. Maybe if the prompts included the art style I would be able to clearly identify the better ones, but they don't. Style is where I see most of the differences.

Disclaimer: I only clicked on Surprise me.


comparing it to the lmsys chatbot arena, what sort of an option would you expect? the prompts essentially come from public HF datasets like parti prompts, where they test a bunch of stuff (prompt adherence, attention mapping [something in front of something else, etc.], aesthetics, photo-realism, etc.), so it is hard to ask about each category.


The question is OK, but I need a clear input to decide which one is better. For example: "A serene forest night, a lamp-lit path leads to a cozy wooden house." It comes up with a very detailed, almost photorealistic image of the scene, while also bringing up a very well-painted one. What do I choose? The input didn't mention anything about the style, so it's very hard for me to pick a winner unless (like I said) I'm incredibly subjective.


i see, and yeah you are right about that case. we probably need realistic/artistic tags as you mentioned. thanks for the example! we'll probably include something like that in the next release and group models by ELO in different categories (which can be considered the analogue of languages in the chatbot arena).
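
the per-category ELO part is the easy bit; a minimal sketch of the update we'd run per vote (K=32 and the 400 spread are just the usual chess defaults):

    # minimal Elo update per pairwise vote (usual chess-style constants)
    def elo_update(r_winner, r_loser, k=32, spread=400):
        expected = 1 / (1 + 10 ** ((r_loser - r_winner) / spread))  # P(winner wins)
        delta = k * (1 - expected)
        return r_winner + delta, r_loser - delta

    # one rating table per category tag (realistic, artistic, ...)
    ratings = {"realistic": {"model_a": 1500.0, "model_b": 1500.0}}
    r = ratings["realistic"]
    r["model_a"], r["model_b"] = elo_update(r["model_a"], r["model_b"])  # a beat b
    print(r)  # {'model_a': 1516.0, 'model_b': 1484.0}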


Glad I could help!


How transferable is open source experience from major projects (like being a core developer of the Python language itself) when it comes to the O-1 criterion of "reviewing others' work"?


Very transferable and also helpful for a green card application.


Pretty neat implementation. In general, for these sorts of exercises (and even if the intention is to go to prod with custom kernels), I lean towards Triton for writing the kernels themselves. It is much easier to integrate into the toolchain, and it allows a level of abstraction that doesn't affect performance even a little bit while providing useful constructs.


yeah, even the official flashattention is moving many implementations from CUTLASS to Triton, except for the main mha backward/forward pass


It was written with cutlass? No wonder Peter Kim found it valuable and worthwhile to de-obfuscate. Adopting a new programming language invented by OpenAI doesn't sound like a much better alternative. I'd be shocked if either of them were able to build code for AMD GPUs, where it's easy to adapt CUDA code, but not if it's buried in tens of thousands of lines of frameworks. I like open source code to have clarity so I can optimize it for my own production environment myself. When people distribute code they've productionized for themselves, it squeezes out all the alpha and informational value. Just because something's open source doesn't mean it's open source. I think people mostly do it to lick the cookie without giving much away.


Triton has an AMD backend, although work is still ongoing.


You will also be able to use Triton to target Ryzen AI.


As a person who finds CUDA extremely easy to write and integrate, what does Triton have to offer?


block-level rather than thread-level programming, plus automatic optimization across hyperparameters; it makes it much easier to write fast kernels
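
e.g. the canonical vector-add (a sketch following the shape of the upstream Triton tutorial): you write one program per block of elements, and Triton handles the thread-level mapping and memory coalescing for you.

    import torch
    import triton
    import triton.language as tl

    @triton.jit
    def add_kernel(x_ptr, y_ptr, out_ptr, n, BLOCK: tl.constexpr):
        pid = tl.program_id(axis=0)               # one program per block, not per thread
        offs = pid * BLOCK + tl.arange(0, BLOCK)  # the whole block's indices at once
        mask = offs < n                           # guard the ragged tail
        x = tl.load(x_ptr + offs, mask=mask)
        y = tl.load(y_ptr + offs, mask=mask)
        tl.store(out_ptr + offs, x + y, mask=mask)

    x = torch.randn(10_000, device="cuda")
    y = torch.randn(10_000, device="cuda")
    out = torch.empty_like(x)
    grid = (triton.cdiv(x.numel(), 1024),)
    add_kernel[grid](x, y, out, x.numel(), BLOCK=1024)
    assert torch.allclose(out, x + y)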


You mean triton the inference server or triton the DSL for cuda?


they mean the DSL (not necessarily just for CUDA)


triton the DSL.


> allows a level of abstraction that doesn't affect performance even a little bit

The second part of this sentence is true because the first part is false.


zero-cost abstractions exist. that doesn't mean all abstractions are zero-cost, or that being zero-cost somehow invalidates their abstractness/genericness. but maybe we differ on the definition of abstractions.


> zero cost abstractions exist

So does perpetual motion :shrug: But my point is that Triton is not an abstraction in the least. Source: 1) I spent 6 months investigating targeting other backends; 2) Phil himself said he doesn't care to support other backends: https://github.com/openai/triton/pull/1797#issuecomment-1730...


It's amazing how heavily moderated HN is. I have a response here that's been deleted that is like 15 words, including a link to a source that corroborates my claim, but that response contains a transcribed emoji and so it's been deleted by dang or whoever. Lol, super rich environment for discourse we've got going here.


Latency is really important, and that is honestly why we re-wrote most of this stack ourselves, but the project with the guarantee of <25ms looks interesting. I wish there was an "instant" mode where, when enough workers are available, it could just do direct placement.


To be clear, the 25ms isn't a guarantee. We have a load testing CLI [1] and the secondary steps on multi-step workflows are in the range of 25ms, while the first steps are in the range of 50ms, so that's what I'm referencing.

There's still a lot of work to do on optimization though, particularly to improve the polling interval if there aren't workers available to run the task. Some people might expect to set a max concurrency limit of 1 on each worker and have each subsequent workflow take 50ms to start, which isn't the case at the moment.

[1] https://github.com/hatchet-dev/hatchet/tree/main/examples/lo...
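
For intuition on why the polling interval sets the floor (toy numbers, not our actual scheduler): a task that arrives uniformly at random within a poll window waits half the interval on average before pickup, plus dispatch overhead.

    import random

    # toy illustration: average wait before pickup under fixed-interval polling
    def avg_wait_ms(poll_interval_ms, trials=100_000):
        return sum(random.uniform(0, poll_interval_ms) for _ in range(trials)) / trials

    print(avg_wait_ms(50))  # ~25ms: half the polling interval, before any dispatch cost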

