SJSU CS157A DBMS1 Dr. Lin 2018-08-28T

Go to the DB1 project page: http://xanadu.cs.sjsu.edu/~drtylin/classes/cs157A/Project/DB1/
Download and run the create_FIRST1a.rtf file http://xanadu.cs.sjsu.edu/~drtylin/classes/cs157A/Project/DB1/create_FIRST1a.rtf on your DB.
Then download and run "populate_FIRST1a .rtf" http://xanadu.cs.sjsu.edu/~drtylin/classes/cs157A/Project/DB1/populate_FIRST1a%20.rtf, the one that has a space in the name, because the one without a space in the name has "FIRST1a_ID_SEQ.nextval" in the queries, which the table creation file "create_FIRST1a.rtf" doesn't create.

Then execute these commands in turn and review the results:

select sum(salary) from first1a;
-- remember, this poor guy got cancer from the company's parts he was handling
delete from first1a where snum="s3" and pnum="p2";
select sum(salary) from first1a;

Dr. Lin said that this employee wanted time off but also should still be paid. I think the point he was trying to show here is that by removing him from the list of employees who are available to continue working in the company, because of the poor schema design, that he was also removed from sum(salary), so now we don't know how much the company is paying its employees because the database design can't properly represent that.

However, I believe Dr. Lin introduced a mistake when he used an aggregate function "sum(salary)", because each employee's salary is getting counted each time for the number of parts that employee handles. What we really want to do is see each employee's salary once, and then sum up the results.

See each employee's salary once:

select distinct snum,salary
    from first1a;
+------+--------+
| snum | salary |
+------+--------+
| S1   |  40000 |
| S2   |  30000 |
| S3   |  30000 |
| S4   |  40000 |
| S7   |  60000 |
+------+--------+

Then, from that table, select sum(salary):

select sum(salary)
    from (
        -- note that this is the query from above
        select distinct snum,salary
            from first1a
    ) table_alias_1;
+-------------+
| sum(salary) |
+-------------+
|      200000 |
+-------------+

What this shows is that you can not only select from tables in the database, but you can also select from the results of a select statement, which is called a subquery. Note that "table_alias_1" is just a necessary part of the MariaDB syntax - without naming it that, an error would appear: "ERROR 1248 (42000): Every derived table must have its own alias".

Note that the ") table_alias_1" could have just as easily read ") AS table_alias_1". "AS" is optional. However, table_alias_1 is not a very good name for the subquery. Sometimes it's difficult to think of a good name for a subquery so we end up naming them things like "step1", "step2", etc., but in this case, an apt name could be "employee_salaries" or "distinct_employee_salaries".

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.