Let me give you a quick overview of MongoDB.
Why is MongoDB interesting?
How is it useful for our application, and what are some of the salient features of
MongoDB in contrast to traditional SQL databases?
This will not be an entire treatise on databases.
I assume that you have sufficient knowledge of databases,
so what I introduce about MongoDB should be easy for you to follow.
From your prior knowledge of databases,
I assume that you already understand that databases are used to store structured data and
also enable you to perform various operations on the data including creating the data,
inserting records into the database,
updating an existing record in the database or deleting a record from the database.
These are the typical kinds of operations that are supported on databases.
Structured query language or SQL-based databases have been
very popular for a long time as a means of storing data.
MySQL is one example of an SQL-based database.
They have been very effective in
storing data and then addressing many of the needs of applications.
Indeed, many websites already use SQL databases as the backend for storing data.
Given that, why are NoSQL databases
important? With the new kinds of applications that are coming online,
there is an increasing demand for
new features, not all of which SQL-based databases can address.
So this is where NoSQL, or "not only
SQL", databases are gaining a lot of ground,
MongoDB being one example of that.
So the NoSQL databases are designed to
address some of the shortcomings of SQL-based databases.
The NoSQL databases themselves can be classified into four different categories.
We have document-based databases like MongoDB.
We have the simpler key-value based databases like Redis,
column-family based databases like
Cassandra, and then the newer graph databases like Neo4j.
And indeed, there are more now in the market than these examples that I have given.
But of course in this course,
we will be concentrating primarily on document-based databases, MongoDB in particular.
So I will review more about MongoDB in the rest of this lecture.
Document databases as the name implies are built around documents.
A document is a self-contained unit of information and can be in many different formats,
JSON being one of the most popular formats for storing documents in a document database.
As an example, a JSON document is shown here, and
this is the kind of thing that would be stored in a typical document database.
Documents themselves can be organized into collections.
So a collection is a group of documents.
And in turn, the database itself can be considered as a set of collections.
So these terms, documents,
collections and the database, will occur frequently when we
discuss document databases, and MongoDB in particular.
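To make these three terms concrete, here is a minimal sketch in plain Python, with no MongoDB involved. The field names and the collection names `dishes` and `comments` are hypothetical, chosen only for illustration; a real database would of course manage this for you.

```python
import json

# A document: a self-contained unit of information (hypothetical fields).
dish = {"name": "Example Pizza", "description": "A sample document"}

# A collection: a group of documents.
dishes = [dish, {"name": "Another Dish", "description": "A second document"}]

# A database: a set of named collections.
database = {"dishes": dishes, "comments": []}

# A document maps directly to and from JSON text.
as_json = json.dumps(dish)
restored = json.loads(as_json)
```

Here `restored` is equal to the original `dish`, which is the sense in which a JSON document is self-contained.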
Why are NoSQL databases of interest to us?
In particular, scalability is one of
the reasons why NoSQL databases have done very well.
Now, in terms of scalability when we look at the two requirements,
availability and consistency of the databases,
typically SQL databases find it very
difficult to meet both the requirements simultaneously.
So there is a trade-off between availability and consistency.
This is where NoSQL databases have been a lot
more successful at meeting both requirements.
This is where the third aspect highlighted here,
partition tolerance, also comes into effect.
Now, partitioning an SQL database and then distributing it is not as straightforward,
whereas a NoSQL-based database is a lot more amenable
to being subdivided and then distributed across multiple servers.
The second aspect of why NoSQL databases have been popular is ease of deployment.
When you use an SQL database,
there is a need for matching the records in your SQL database back
to objects in your native language like Java or JavaScript and so on.
So there is a need for object-relational mapping, and this is
where an intermediate layer needs to fill in this requirement.
With a NoSQL database, like a document-based database storing data in the form of JSON,
the mapping becomes quite straightforward and that is one of the reasons why
NoSQL databases have been very popular in the web development area.
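As a rough illustration of why that mapping is straightforward, here is a small Python sketch. The `Dish` and `Comment` classes are hypothetical, made up for this example; the point is only that a nested object turns into a nested document without any table joins or mapping layer.

```python
import json
from dataclasses import dataclass, asdict, field

@dataclass
class Comment:
    author: str
    rating: int

@dataclass
class Dish:
    name: str
    comments: list = field(default_factory=list)

# An object in our native language, with a nested list of objects.
dish = Dish("Example Pizza", [Comment("alice", 5)])

# asdict() converts nested dataclasses into nested dicts, that is,
# into the same shape a JSON document would have.
document = asdict(dish)
text = json.dumps(document)
```

In a relational schema the comments would typically live in a separate table and be joined back in; here they simply travel inside the document.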
Coming to MongoDB in particular,
MongoDB is a document database.
The server itself can support multiple databases.
A database in particular is a set of collections,
and the collection itself as we discussed earlier is a set of documents.
So the document becomes the unit of information in the case of MongoDB.
The document in MongoDB is nothing but a JSON document.
In fact, MongoDB stores the document in a more compact form called the BSON format.
We'll talk about that in the next slide.
While MongoDB is a document-based database,
it stores the JSON documents in a compact form
called the BSON format, or the binary JSON format.
Now, this supports a length prefix on each value so
that skipping over a field becomes a lot easier.
So as you see, MongoDB supports more features than a simple document database.
The information about the type of a field value is also stored.
And in addition, within the JSON document,
additional primitive types are supported which are
useful when you are performing operations on the database:
things like the UTC date format and raw binary.
It also uses an ObjectId format
for storing the ID of each document in the database, if we choose to.
Let's talk about that in a bit more detail in the next slide.
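To see what the type information and the length prefix buy us, here is a simplified Python sketch of a BSON-style encoder. It is an illustration only, not MongoDB's own code: it handles just UTF-8 strings and 32-bit integers, and the helper names `encode_element`, `encode_document` and `field_names` are my own. As far as I understand the BSON layout, variable-length values such as strings carry an explicit length, while fixed-size values such as a 32-bit integer are skipped by their known size.

```python
import struct

def encode_element(name: str, value) -> bytes:
    """Encode one field as: type byte, field name (NUL-terminated), value."""
    key = name.encode("utf-8") + b"\x00"
    if isinstance(value, str):
        data = value.encode("utf-8") + b"\x00"
        # Type 0x02 (string): 4-byte little-endian length prefix, then the bytes.
        return b"\x02" + key + struct.pack("<i", len(data)) + data
    if isinstance(value, int) and not isinstance(value, bool):
        # Type 0x10 (32-bit integer): fixed size, so no length prefix is needed.
        return b"\x10" + key + struct.pack("<i", value)
    raise TypeError("this sketch only handles str and 32-bit int values")

def encode_document(doc: dict) -> bytes:
    """Encode a flat dict; the document also starts with its total byte length."""
    body = b"".join(encode_element(k, v) for k, v in doc.items()) + b"\x00"
    return struct.pack("<i", len(body) + 4) + body

def field_names(data: bytes) -> list:
    """List the field names while skipping over every value unread."""
    names = []
    pos = 4  # skip the document's own length prefix
    while data[pos] != 0:
        type_tag = data[pos]
        end = data.index(b"\x00", pos + 1)
        names.append(data[pos + 1:end].decode("utf-8"))
        pos = end + 1
        if type_tag == 0x02:
            (length,) = struct.unpack_from("<i", data, pos)
            pos += 4 + length  # jump over the string using its length prefix
        elif type_tag == 0x10:
            pos += 4  # fixed-size 32-bit integer
        else:
            raise ValueError("type not handled in this sketch")
    return names

encoded = encode_document({"hello": "world", "count": 3})
names = field_names(encoded)
```

Notice that `field_names` never decodes the string value itself; it reads the length and jumps ahead, which is the kind of skipping the length prefix makes easy.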
Let's talk about the MongoDB ObjectId.
Every document in a MongoDB database must have an ID field,
an "_id" field which acts as the primary key for the document.
And this field is unique for each document.
The ID field itself can take
many formats, and one particular format, which MongoDB automatically
assigns in case you don't choose to use your own ID field,
is the ObjectId that is created by default by MongoDB.
So the ObjectId itself is
a structured piece of information that is stored as the id of the document.
As an example, the ID field that is automatically
assigned by Mongo in case you don't specify an ID field,
contains the ObjectId in the form of a long string.
Now this string has a specific format which
enables it to store a number of pieces of information within the ObjectId.
Let's look at the structure of the ObjectId itself in the next slide.
As I mentioned, the ObjectId field itself is
a 12 byte field which stores information in a specific format.
The first four bytes include a timestamp,
the typical Unix timestamp at a resolution of one second.
So this is stored in the first four bytes.
Then the next three bytes store the machine ID,
the machine on which the Mongo server is running,
and the next two bytes are the process ID,
the specific Mongo process which has created this document,
and then the last field, the remaining three bytes, is an increment.
Now as you understand,
the timestamp field itself has a resolution of one second.
So if you have multiple documents that are stored within the same second,
then the increment field will distinguish among the documents.
The increment field is a self-incrementing field so
each new document created within a second will get a new increment value.
So by combining these two,
you can easily distinguish between
different documents that are stored within your document database.
So this enables you to clearly give a unique ID to each document.
Not only that, given an ID,
you can easily retrieve information from this ID.
So for example, you can get hold of the ObjectId and then call
the getTimestamp method of the ObjectId and
that will return the timestamp in the ISO date format.
So that'll enable you to identify when this document has been created.
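To see how much information sits inside that string, here is a small Python sketch that splits a 24-character hexadecimal ObjectId into the four fields just described. The function name `parse_object_id` is my own, and the sample ID is only an illustrative value. Note that this follows the layout described in this lecture; to my understanding, newer MongoDB versions replace the machine ID and process ID with a single five-byte random value, so treat those two fields as illustrative. The leading four-byte timestamp is the part that getTimestamp relies on.

```python
from datetime import datetime, timezone

def parse_object_id(hex_id: str) -> dict:
    """Split an ObjectId hex string into the fields described in the lecture."""
    raw = bytes.fromhex(hex_id)
    if len(raw) != 12:
        raise ValueError("an ObjectId is 12 bytes (24 hex characters)")
    seconds = int.from_bytes(raw[0:4], "big")  # Unix time, one-second resolution
    return {
        "timestamp": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "machine_id": raw[4:7].hex(),                    # next three bytes
        "process_id": int.from_bytes(raw[7:9], "big"),   # next two bytes
        "increment": int.from_bytes(raw[9:12], "big"),   # last three bytes
    }

# Illustrative ObjectId value, not taken from a real database.
parts = parse_object_id("507f1f77bcf86cd799439011")
created_at = parts["timestamp"].isoformat()
```

For this sample value the first four bytes, `507f1f77`, decode to a creation time of 2012-10-17 21:13:27 UTC.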
With this quick understanding of MongoDB,
let's proceed to the exercise where we'll
first install MongoDB on our computer and thereafter
interact with the MongoDB database using the Mongo REPL,
the read-evaluate-print
loop that Mongo supports.
Thereafter, we will look at how we can access
the Mongo server from within our Node application in the next lesson.