I know a little bit about relational databases like MySQL. What are NoSQL databases? How do they work, and how do they differ from relational databases?

(NoSQL is blazingly fast when querying for a certain data set, no matter how big the table is, while SQL quickly runs into bottlenecks when your table is something like 500 GB and the index tables for that table are even bigger, for example when you have multiple indexes.)

That's the theory at least. In practice, relational databases allow you to make the exact same feature tradeoffs for performance, without forcing you to do so. You're not forced to use joins, transactions and referential integrity constraints, but you can when you need them. You can denormalize your data model just as well in a relational DBMS. Some of them even facilitate this by supporting materialized views: views where the query result is cached on disk, so you get the performance of queries on denormalized tables while still letting the database enforce referential integrity.
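To make the denormalization point concrete, here is a rough sketch using Python's built-in `sqlite3` module. SQLite has no native materialized views, so the sketch fakes one with a trigger-maintained table; a DBMS such as PostgreSQL would use `CREATE MATERIALIZED VIEW` instead. The table names (`customers`, `orders`, `order_report`) and the helper `build_demo` are made up for illustration and don't come from any real schema.

```python
import sqlite3


def build_demo():
    """Create a normalized schema plus a denormalized read table (hypothetical names)."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE customers (
            id   INTEGER PRIMARY KEY,
            name TEXT NOT NULL
        );
        CREATE TABLE orders (
            id          INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            amount      INTEGER NOT NULL
        );
        -- Denormalized copy: customer name stored next to each order,
        -- standing in for a materialized view.
        CREATE TABLE order_report (
            order_id      INTEGER PRIMARY KEY,
            customer_name TEXT NOT NULL,
            amount        INTEGER NOT NULL
        );
        -- Keep the denormalized copy up to date on every new order.
        CREATE TRIGGER orders_after_insert AFTER INSERT ON orders
        BEGIN
            INSERT INTO order_report (order_id, customer_name, amount)
            SELECT NEW.id, c.name, NEW.amount
            FROM customers c
            WHERE c.id = NEW.customer_id;
        END;
    """)
    return conn


def run_demo():
    conn = build_demo()
    conn.execute("INSERT INTO customers (id, name) VALUES (1, 'alice')")
    conn.execute("INSERT INTO orders (customer_id, amount) VALUES (1, 250)")

    # Read path: no join needed, the data is already flattened.
    rows = conn.execute(
        "SELECT customer_name, amount FROM order_report"
    ).fetchall()

    # Write path: the database still rejects an order for a missing customer.
    try:
        conn.execute("INSERT INTO orders (customer_id, amount) VALUES (99, 10)")
        fk_enforced = False
    except sqlite3.IntegrityError:
        fk_enforced = True

    return rows, fk_enforced


if __name__ == "__main__":
    print(run_demo())
```

Reads hit the flat `order_report` table without a join, while writes to `orders` still go through the foreign key check. The trigger here only handles inserts; a fuller version would also need to cover updates and deletes.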

There are some special-purpose use cases for which a particular non-relational database may be better suited than a general-purpose RDBMS, but don't take such broad statements on NoSQL vs SQL too seriously, especially when you find them on the MongoDB website. MongoDB is one of the 'biggest' general-purpose NoSQL databases right now, and in my opinion, as it is now, it demonstrates why there isn't really a use case where a general-purpose NoSQL database is better than a general-purpose RDBMS. In the stated case of a single 500 GB table, for example, you'd need more than 500 GB of RAM to get good read performance out of MongoDB, and unless you use the WiredTiger storage engine (released spring this year), you'd get absolute crap write performance due to database-level write locking.
