- A very large collection (of files) of organized and related data.
- A software package/system that can be used to store, manage and retrieve data from databases that persist for long periods of time!
- Database System
- DBMS + data [+ applications]
- Data Model
- A framework for describing data, data relationships, data semantics, and data constraints.
- The structure of the database using a particular model.
It has two layers.
- Conceptual or Logical Level: describes data stored in database, and their relationships among the data.
- Physical Level: describes how files and records are stored.
- Relational Model
- It is a database model based on first-order predicate logic. Its core idea is to describe a database as a collection of predicates over a finite set of predicate variables, describing constraints on the possible values and combinations of values.
- Entity Relationship Model
- In software engineering, an entity–relationship model (ER model) is a data model for describing the data or information aspects of a business domain or its process requirements, in an abstract way that lends itself to ultimately being implemented in a database such as a relational database.
- Hierarchical Model
- In this model data is organized into a tree-like structure, implying a single upward link in each record to describe the nesting, and a sort field to keep the records in a particular order in each same-level list.
- Relational Database
- A set of relations
- Made up of 2 parts: Schema and Instance
- specifies name of relation, plus name and type of each column
- the actual data at a given time. Cardinality: number of rows, Degree/Arity: number of fields
- Super Key
- A set of fields is a superkey if no two distinct tuples can have same values in all key fields.
- A set of fields is a key for a relation if it is a superkey and no subset of the fields is a superkey. One such key is chosen as Primary Key, others (if exist) are called Candidate Keys.
- Foreign Key
- A set of fields in a tuple in one relation that refer to a tuple in another relation. It reference to primary key of the other relation.
The SQLite project provides a simple command-line utility named sqlite3 (or sqlite3.exe on windows) that allows the user to manually enter and execute SQL statements against an SQLite database. [ref]
We will use this for SQL query exercise.
- Download the CLI from http://www.sqlite.org/download.html, select precompiled binaries for your operating system.
- Download movie.sqlite, our database to run queries upon. [from Dave]
Here, we are working on a IMDB database, containing information about movies, actor/actress, directors, Oscar awards info, etc,. In our database, the Movie relation has the following attributes:
|Attribute Name||Data Type||Description|
|id||CHAR(7)||a unique id assigned to the film|
|name||VARCHAR(64)||the name of the film|
|year||INTEGER||the year the film was released|
|rating||VARCHAR(5)||the MPAA rating of the film; if the film is unrated or has a non-standard rating, this value is NULL.|
|runtime||INTEGER||the running time of the film in minutes|
|genre||VARCHAR(16)||the genre(s) of the film, formed by concatenating one-letter genre codes. no assumptions should be made about the order in which these codes appear in a given genre string|
|earnings_rank||INTEGER||the earnings rank of the film (only included for the top 200; NULL otherwise)|
And part of the actual table looks like this.
|1300854||Iron Man 3||2013||PG-13||130||A||13|
|1690953||Despicable Me 2||2013||PG||98||N||24|
|0770828||Man of Steel||2013||PG-13||143||A||54|
|1623205||Oz the Great and Powerful||2013||PG||130||V||90|
|1408101||Star Trek Into Darkness||2013||PG-13||132||A||96|
|0816711||World War Z||2013||PG-13||123||A||135|
This is actually generated by the following SQL query, FYI.
select * from movie order by year desc limit 10;
- Write down the schema of this table according to the description.
- Explain the meaning of those datatype keywords, e.g.
- Discussion: Why do we choose these datatypes? Can we choose other types instead?
- Based on these 10 example records, what field or set of fields must not be a super key?
- Based on these 10 example records, what field or set of fields might be a super key?
- Based on these 10 example records, what field or set of fields might be a candidate key?
- Based on our precise definition, what’s our primary key? What about candidate key?
- Try to execute the example query in SQLite.
Please open the
movie.sqlitedatabase using the following command,
bash ./sqlite3 movie.sqliteIn SQLite command line interface, you can type
.helpfor help information.
- With the help information, try to display all the tabls schemas in the database. And then try to display only the schema info for
- Run an query to show the name and earnings rank of top 30 grossing movies in 2013.
- Show name and year of animations who runs over one hour and a half.
Write down the schema of this table according to the description.
sql movie (id char(7), name varchar(64), year integer, rating varchar(5), runtime integer, genre varchar(16), earnings_rank integer)
Explain the meaning of those datatype keywords, e.g.
Run an query to show the name and earnings rank of top 30 grossing movies in 2013.
select name, earnings_rank from movie where year = 2013 and earnings_rank < 30;
Show name and year of animations who runs over one hour and a half.
select name, year from movie where genre like "%N%" and runtime > 90;
SQL Dialect: http://en.wikibooks.org/wiki/SQL_Dialects_Reference