Hacker News | schwag09's comments

This is correct, it's even open source: https://github.com/deepfield/dnsflow.


This article briefly mentions a very useful analysis tool for NGINX configuration: Gixy.

It looks for the following misconfigurations[0]:

  - [ssrf] Server Side Request Forgery
  - [http_splitting] HTTP Splitting
  - [origins] Problems with referrer/origin validation
  - [add_header_redefinition] Redefining of response headers by "add_header" directive
  - [host_spoofing] Request's Host header forgery
  - [valid_referers] none in valid_referers
  - [add_header_multiline] Multiline response headers
  - [alias_traversal] Path traversal via misconfigured alias
The alias traversal gotcha is one of the most pernicious I've seen. A single, seemingly innocuous '/' is the difference between a path traversal vulnerability and a safe configuration.

[0]: https://github.com/yandex/gixy#what-it-can-do
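For anyone unfamiliar with the alias traversal issue, here's a minimal sketch of a vulnerable location block (paths and directory names are illustrative):

```nginx
# Vulnerable: the location prefix has no trailing slash, so a request for
# /static../secret.txt maps to /var/www/static/../secret.txt and escapes
# the intended directory.
location /static {
    alias /var/www/static/;
}

# Safe: the trailing slash on the location prefix blocks the traversal.
location /static/ {
    alias /var/www/static/;
}
```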


Interesting tool. This looks like the Java equivalent of Facebook's Python taint analysis tool Pysa: https://pyre-check.org/docs/pysa-basics/.

From what I can tell from the documentation, Mariana Trench requires you to bring your own sources/sinks/sanitizers rather than including commonly used rules or heuristics, so expect a lot of up-front cost to integrate it into your toolchain. Not a huge deal since users can write and share their own rules, but this looks like a framework for sophisticated static analysis rather than a batteries included solution.


It literally says in the `Getting Started` docs that it is similar to Pysa.


Good catch. I went straight to "Documentation", which links to https://mariana-tren.ch/docs/getting-started, while the "Getting Started" button somewhat confusingly links to https://mariana-tren.ch/docs/overview.

After a deeper dive I also noticed that my second statement about "batteries included" isn't totally true. Digging around in the GitHub repository I found a dozen or so heuristics here: https://github.com/facebook/mariana-trench/tree/main/configu.... It'll be cool to watch this fill out a bit.


It is the latest system in that same family - more details here: https://engineering.fb.com/2021/09/29/security/mariana-trenc...


> In computing, the relationship between structure and behavior, between program and process, is perplexing in itself. That this relationship so often can be subverted, allowing an untrusted data provider to preternaturally gain control over program execution, is disquieting. Why is this a common phenomenon in computer systems?

Mixing code and content leads to all the most common injection-style vulnerability classes: buffer overflows, XSS, SQLi, command injection, etc. Fundamentally, I believe this is at the heart of the problem. The paper does go on to address this:

> Another proposed theory blames John von Neumann's stored program concept [15] for the woes of arbitrary code execution; the fact that data and program in computers are stored on the same storage medium may allow an attacker to illegitimately modify the program rather than the intended data [10].

From there the paper offers ROP as a counterpoint to the "stored-program hypothesis." This makes sense because ROP achieves arbitrary code execution without necessarily modifying the code. Still, while I agree that mixing code and content may not be the highest-level abstraction for concern here, I do think it's often a fundamental flaw from a practical perspective.
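To make the code/content mixing concrete, here's a minimal SQL injection sketch in Python (the table and input are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 'admin')")

user_input = "' OR '1'='1"

# Mixing code and content: the input becomes part of the SQL program.
unsafe = f"SELECT * FROM users WHERE name = '{user_input}'"
print(conn.execute(unsafe).fetchall())  # returns every row

# Keeping them separate: a bound parameter keeps the input as pure data.
safe = "SELECT * FROM users WHERE name = ?"
print(conn.execute(safe, (user_input,)).fetchall())  # returns nothing
```

The vulnerable path and the safe path differ only in whether the untrusted string is spliced into the program text or handed to the engine as data.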


Your fundamental flaw is my prized feature.

Metaprogramming/reflection would be impossible without treating code as content.


Let me rephrase: I think it's often a fundamental flaw from a practical security perspective. Reflection and constructs like 'eval' are often at odds with security. More generally, utility is often at odds with security.

Maximizing utility on a website might look like dropping a user into a root REPL so they can perform arbitrary actions and are unlimited in their capabilities. Maximizing security might look like shutting down the website and all its associated servers. In reality, from a security perspective, we try to achieve a balance between the two by maximizing utility while minimizing security risk.

Personally, I think code as content, reflection, eval, etc. move the slider too far in the utility direction. Of course this is a difficult problem to solve because nearly all our systems are built on the idea of code as content, which is also what makes it such a generalized vulnerability class.
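As a small illustration of moving the slider back toward security, Python's `ast.literal_eval` parses a string strictly as data, where `eval` would execute it as code:

```python
import ast

user_input = "[1, 2, 3]"

# eval() treats the string as code: any expression will execute.
# ast.literal_eval() treats it strictly as data: only literals parse.
print(ast.literal_eval(user_input))  # [1, 2, 3]

try:
    ast.literal_eval("__import__('os').getcwd()")
except ValueError as exc:
    print("rejected non-literal input:", exc)
```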


And this tension in goals is not limited to software. Modifying and repairing hardware is similarly fraught. You can get yourself into all kinds of trouble by tinkering with (say) your car.

The real difference is that you generally have to be in physical proximity to hardware in order to tinker with it, and that puts a very hard constraint on random people mucking with your car. Not so for software, especially in the age of the internet.


Not at all, even in Lisps there’s quoting. In-band signalling isn’t necessary for either meta programming or reflection.


Saying the same thing from yet another perspective:

What you call in-band - I call data. What you call out-of-band - I call code.

What I call metaprogramming is erasing the distinction.


Signalling is the same thing as mutability. To signal is to flip a bit.

If code is data and data is immutable then code is immutable.

I am sure you see the contradiction in objectives.


This reminds me of the parable of the Mexican fisherman and the Harvard MBA:

'An American investment banker was at the pier of a small coastal Mexican village when a small boat with just one fisherman docked. Inside the small boat were several large yellowfin tuna. The American complimented the Mexican on the quality of his fish and asked how long it took to catch them.

The Mexican replied, “only a little while.”

The American then asked why didn’t he stay out longer and catch more fish?

The Mexican said he had enough to support his family’s immediate needs.

The American then asked, “but what do you do with the rest of your time?”

The Mexican fisherman said, “I sleep late, fish a little, play with my children, take siestas with my wife, Maria, and stroll into the village each evening where I sip wine, and play guitar with my amigos. I have a full and busy life.”

The American scoffed. “I have an MBA from Harvard, and can help you,” he said. “You should spend more time fishing, and with the proceeds, buy a bigger boat. With the proceeds from the bigger boat, you could buy several boats, and eventually you would have a fleet of fishing boats. Instead of selling your catch to a middle-man, you could sell directly to the processor, eventually opening up your own cannery. You could control the product, processing, and distribution,” he said. “Of course, you would need to leave this small coastal fishing village and move to Mexico City, then Los Angeles, and eventually to New York City, where you will run your expanding enterprise.”

The Mexican fisherman asked, “But, how long will this all take?”

To which the American replied, “Oh, 15 to 20 years or so.”

“But what then?” asked the Mexican.

The American laughed and said, “That’s the best part. When the time was right, you would announce an IPO, and sell your company stock to the public and become very rich. You would make millions!”

“Millions – then what?”

The American said, “Then you could retire. Move to a small coastal fishing village where you could sleep late, fish a little, play with your kids, take siestas with your wife, and stroll to the village in the evenings where you could sip wine and play guitar with your amigos.”'


I suspect this means a scanner that can derive all necessary information without any configuration. For example, consider a scanner looking for API endpoint authorization inconsistencies. Does the scanner need you to describe your authorization scheme, or can you simply run the thing and it figures it out?

This can be easy or hard depending on how bespoke your application is. If you're using something like Ruby on Rails, then there's a paved road that a scanner can preconfigure to understand your application. If you're using a homegrown authorization framework, then a scanner will likely have a hard time understanding your application and will need to be configured.


That's an interesting approach. It incentivizes users to have a backup plan while also providing an escape hatch if things go wrong. The only issue is, as you alluded to, this could price out a large portion of the world's developer population. Perhaps the price could be determined by where you are in the world, although this may not cover US support costs. That and users could attempt to game the system by faking their location. Regardless, it's an interesting approach I hadn't considered before - I like it.


It's great to see more introductory ReDoS material! I took a deep dive on ReDoS myself recently and found the available material somewhat lacking, especially for beginners. Over the course of my investigation I found some interesting bugs in big name projects and created a few blog posts as an introduction to the bug class:

* https://blog.r2c.dev/2020/finding-python-redos-bugs-at-scale...

* https://blog.r2c.dev/2020/improving-redos-detection-with-dli...

The culmination of this work was a Python regex linter that could automatically detect ReDoS expressions with fairly high accuracy - Dlint's DUO138 rule: https://github.com/dlint-py/dlint/blob/master/docs/linters/D....

In my opinion, the best solution, as this article mentions, is to avoid the bug class altogether by using something like RE2 when possible. Nevertheless, I found ReDoS to be a really cool bug class at the intersection of computer science and software engineering.
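For anyone who wants to see the pathology directly, here's a small Python sketch of the exponential backtracking (the pattern and input sizes are illustrative):

```python
import re
import time

# A classic ReDoS-prone pattern: the nested quantifiers let the
# backtracking engine try exponentially many ways to split the 'a's.
VULNERABLE = re.compile(r"^(a+)+$")

def time_match(pattern, text):
    start = time.perf_counter()
    result = pattern.match(text)
    return result, time.perf_counter() - start

# A non-matching suffix forces full backtracking; each extra 'a'
# roughly doubles the work.
for n in (10, 14, 18):
    result, elapsed = time_match(VULNERABLE, "a" * n + "b")
    print(f"n={n}: matched={result is not None}, {elapsed:.4f}s")
```

An engine like RE2 rejects or linearizes patterns like this, which is why avoiding the backtracking implementation avoids the bug class.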


At one point in time I created a Python package to highlight this benefit of wheels: "Avoids arbitrary code execution for installation. (Avoids setup.py)" - https://github.com/mschwager/0wned

Of course Python imports can have side-effects, so you can achieve the same results with 'import malicious_package', but the installation avenue was surprising to me at the time so I created a simple demo. Also consider that 'import malicious_package' is typically not run as root whereas 'pip install' is often run with 'sudo'.
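Here's a minimal sketch of the import side-effect point, simulating a package whose module body runs code the moment it loads (the package name is hypothetical):

```python
import sys
import types

# A module body is just code that runs when the module first loads,
# much as setup.py runs at 'pip install' time for source distributions.
source = 'print("side effect at module load time"); PAYLOAD_RAN = True'

# Build the module and execute its body; the side effect fires here.
module = types.ModuleType("malicious_package")
exec(source, module.__dict__)

# Register it so a normal import statement finds it.
sys.modules["malicious_package"] = module

import malicious_package  # already loaded; this just binds the name
print(malicious_package.PAYLOAD_RAN)  # True
```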


I thought the rule was never run `sudo` with `pip install` or you'll screw up the permissions on your system.


For other reasons, it is just as bad when using Python from Homebrew on macOS where sudo isn't necessary. `pip install` will install modules in `/usr/local` where they will get mixed with Homebrew-provided Python packages. I was hoping there would be a way to make `pip install --user` the default, but I couldn't figure it out the last time I checked.


This is exactly why you want to do all (as in 100%) of your python work in a virtual environment, so the packages are completely isolated in your ~/.virtualenvs/[ENVNAME]/lib/pythonx.x/site-packages.

Never, ever, do a pip install in your non virtualenv environment.


If you're using a *nix system, you could export `PIP_USER=true` in your shell profile, which makes `pip install` behave like `pip install --user`. (A plain alias won't work here, since alias names can't contain spaces.)
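pip also reads a per-user config file where the same default can be set (a sketch; the exact path varies by platform and pip version):

```ini
# e.g. ~/.config/pip/pip.conf on Linux
[install]
user = true
```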


I had the same thought. Although it looks like Signal uses a proxy for GIPHY requests [1] and has at least thought about the privacy implications of GIPHY support [2].

[1] https://github.com/signalapp/Signal-Android/issues/9628

[2] https://signal.org/blog/signal-and-giphy-update/

