Yeah, our basic integration test suite takes over 20 minutes to run in CI, and likely longer locally, though I never try to run the full suite locally. That doesn't even include PDVs and other continuous testing that runs in the background.
The other day, I wrote a claude skill to pull the logs for a PR's failing tests from CI as a CSV, for feeding back into claude for troubleshooting. It helped with some debugging, but it was fraught and needed human guidance to keep it from going in strange directions. I could see this "fix the tests" workflow instrumented as an overnight churn loop that is forbidden from modifying the test files it runs, with engineers reviewing in the morning whether more tests pass.
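For anyone curious, the log-pulling step can be sketched with the gh CLI (the PR number is a placeholder, and the actual skill's CSV shaping isn't shown here):

```shell
# Hypothetical sketch of the "pull failing-test logs for a PR" step,
# assuming GitHub Actions and the gh CLI are in use.
pr=1234   # placeholder PR number
branch=$(gh pr view "$pr" --json headRefName -q .headRefName)
run=$(gh run list --branch "$branch" --limit 1 --json databaseId -q '.[0].databaseId')
# --log-failed prints only the log output of failed steps
gh run view "$run" --log-failed > failing-tests.log
```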
Maybe agentic TDD is the future. I have a bit of a nightmare vision of SWEs becoming more like QA in the future, but with much more automation. More engineering positions may become adversarial QA for LLM output. Figure out how to break LLM output before it goes to prod. Prove the vibe coded apps don't scale.
In the exercise I described above, I was just prompt-churning between meetings (having claude record its work and feeding that to the next prompt, pulling test logs between attempts), without much time to analyze, while another engineer on my team was actually analyzing and manually troubleshooting the vibe-coded junk I was pushing up. Still, we fixed over 100 failing integration tests in a week for a major refactor using claude plus some human(s) in the loop. I do believe it got done faster than we would have finished without AI. I also think the quality is slightly lower than it would have been if we'd had four weeks without meetings to build the thing, but the tests do now pass.
You can also run a command when a user authenticates: grab their keys from github.com/username.keys, validate that they're a member of a specific GitHub team, and let them connect by outputting the keys, or output nothing to deny them access.
It's really great for ops teams where you want to give ssh access and manage it from github teams without needing a complex system.
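A minimal sketch of how that can be wired up with sshd's AuthorizedKeysCommand (the org/team names, paths, and token handling here are assumptions, not a tested setup):

```shell
# /etc/ssh/sshd_config fragment: sshd runs the command on each auth attempt,
# passing the login name as %u
#   AuthorizedKeysCommand /usr/local/bin/github-keys %u
#   AuthorizedKeysCommandUser nobody

# /usr/local/bin/github-keys (hypothetical helper; ORG and TEAM are placeholders)
user="$1"
# 200 from the team-memberships endpoint means the user is in the team
status=$(curl -s -o /dev/null -w '%{http_code}' \
  -H "Authorization: Bearer $(cat /etc/ssh/github-token)" \
  "https://api.github.com/orgs/ORG/teams/TEAM/memberships/$user")
if [ "$status" = "200" ]; then
  curl -sf "https://github.com/$user.keys"   # member: print their public keys
fi
# no output for non-members, so sshd denies access
```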
Honest question: why is ProxyCommand `fun`? What do I get out of ProxyCommand that I don't get from setting the correct order for ProxyJump and running `ssh finalhost -- do-my --bidding`?
ProxyJump is newer functionality; there used to be only ProxyCommand. ProxyJump is a shortcut for the usual way of using ProxyCommand to connect through a bastion host, but ProxyCommand is more flexible: you can run any command to connect to the remote host, whereas ProxyJump only connects over ssh. I think I've replaced all my ProxyCommands with ProxyJump because I rarely need anything beyond the normal use case.
You can get a lot more out of ProxyCommand. For example, you can run SSH over non-IP transports, such as serial, Bluetooth RFCOMM for embedded boards, or vsock for virtual machines with no networking set up at all. The latter is built in and set up automatically by systemd.
ProxyCommand lets you use any command to set up a connection, not necessarily an ssh command like ProxyJump does. Any command works: as long as it reads from stdin and writes to stdout, it can act like a TCP connection.
ProxyJump is a special case of `ProxyCommand ssh -W %h:%p -p <port> <user>@<jumphost>`. You can't replace the `ssh` in there when using ProxyJump.
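In ssh_config terms (host names here are made up), the equivalence and the extra flexibility look roughly like this:

```shell
# ~/.ssh/config

# ProxyJump: shorthand for hopping through a bastion over ssh
Host inner
    ProxyJump bastion

# Roughly the same thing spelled out with ProxyCommand
# (-W forwards stdin/stdout to the target host and port)
Host inner-manual
    ProxyCommand ssh bastion -W %h:%p

# But ProxyCommand can run anything that speaks on stdin/stdout,
# e.g. socat bridging to a VM's vsock with no networking configured
Host myvm
    ProxyCommand socat - VSOCK-CONNECT:42:22
```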
I came across ProxyCommand earlier this week, funnily enough. I have Cloudflare Zero Trust set up with an SSH service[0], and have the server firewall drop all incoming traffic. That helps reduce my attack surface, since I don't have any incoming ports open.
I use ProxyCommand for edge-case devices where key auth is not an option and the password is not controlled by me. ProxyCommand points to a script that retrieves the password from the vault, puts it on the clipboard for pasting, reminds me via stderr that it's done so, and then proxies the connection.
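A hedged sketch of that kind of helper (the vault and clipboard tools here are stand-ins; the real script obviously differs):

```shell
#!/bin/sh
# Hypothetical ProxyCommand helper; ~/.ssh/config would carry:
#   Host legacy-box
#       ProxyCommand /usr/local/bin/pw-proxy %h %p
host="$1"; port="$2"
# `pass` and `xclip` stand in for whatever vault/clipboard tools you use
pass "ssh/$host" | head -n1 | tr -d '\n' | xclip -selection clipboard
echo "password for $host is on the clipboard" >&2   # the stderr reminder
exec nc "$host" "$port"   # then just proxy the raw TCP stream
```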
Interesting. I might have such a use case. Do you have any pointers on best practices for automating password retrieval from vaults? It seems to me that either the vault needs to stay unlocked, or the vault password has to be kept somewhere on disk.
I was thinking it would be nice to have a final print edition for the book collection. Amazon seems to be under the impression that this newer version is coming out in April.
All they'd learn that way is that that phone number has a Signal account, when it was registered, and when it was last active. In other words, it doesn't tell them whether it's part of a given Signal group. (See https://signal.org/bigbrother/.)
They publicly publish these requests. You can see how little information is provided — just a phone number and two unix timestamps IIRC.
https://signal.org/bigbrother/
I might be misremembering or mixing up memories, but I remember something about them only storing a hash of the number.
So the FBI can't ask what phone number is tied to an account, but they could ask whether a specific phone number is tied to a specific account? (As in, Signal gets the number, runs it through their hash algorithm, and compares that hash to the saved one.)
But my memory is very, very bad, so like I said, I might be wrong.
It would be absolutely trivial for the FBI to hash every single assigned phone number and check which one matches. Hashing only provides any anonymity if the source domain is too large to be enumerable.
You don't even need to think about how the hashing scheme and salt are set up. If Signal can check whether a phone number matches the hash in any reasonable amount of time (which is the whole point of keeping a hash in the first place), then the FBI can just do that for every phone number with very realistic compute resources, once they get Signal to cough up the details of the algorithm and the magic numbers used.
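The enumeration attack is easy to demonstrate. Here sha256 stands in for whatever scheme Signal might actually use (the thread doesn't establish it), and the search space is shrunk to a toy 10,000 numbers; real phone numbers are only ~10^10 possibilities, which is still trivial to brute-force:

```shell
# The "stored hash" the service would keep instead of the number itself
stored=$(printf '%s' "+15550001234" | sha256sum | cut -d' ' -f1)

# An attacker who can enumerate the input space just hashes every candidate
recovered=""
for n in $(seq -w 0 9999); do
  candidate="+1555000$n"
  if [ "$(printf '%s' "$candidate" | sha256sum | cut -d' ' -f1)" = "$stored" ]; then
    recovered="$candidate"
    break
  fi
done
echo "recovered: $recovered"   # prints the original number
```

Salting doesn't change the picture here: the attacker only needs the salt handed over along with the hash, since the weakness is the tiny input space, not the hash function.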
For some reason I thought it was open to the public, but France also maintains a full sovereign cloud office suite for use by civil servants: https://lasuite.numerique.gouv.fr/en
You might need to disclose social media accounts, phone numbers, email accounts, and a lot of other information, regardless of your burner: https://www.bbc.com/news/articles/c1dz0g2ykpeo
Depends on when that goes into effect and how thoroughly it's actually implemented.
Yeah, even those looking for the full segment will have trouble finding it if they are not tech savvy and highly motivated.
A relative in their 60s saw headlines about the cancellation and wasn't able to find it until I sent them the archive.org link. They are relatively well informed and competent with technology, but never go digging for hard-to-find media.
I think people on HN tend to overestimate how closely people follow news and how hard they are willing to work to seek out alternative sources of information. I’m with some extended family over the holidays. They might have seen this segment had it aired - I believe it was airing after some football game - but now there’s no chance of that happening. I don’t judge them for it at all, but most of their news consumption is passive through TV or social media. I think a lot of people follow news that way. Life’s busy.
It kind of makes me understand a little better how the censorship regime in other countries is so effective despite it being so easy to hop on a VPN. Raising the barrier to entry even a little reduces the audience from 10,000,000 to a fraction of that, even with the censorship itself being public knowledge.