-
Notifications
You must be signed in to change notification settings - Fork 0
/
notes.txt
695 lines (686 loc) · 28.2 KB
/
notes.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
436
437
438
439
440
441
442
443
444
445
446
447
448
449
450
451
452
453
454
455
456
457
458
459
460
461
462
463
464
465
466
467
468
469
470
471
472
473
474
475
476
477
478
479
480
481
482
483
484
485
486
487
488
489
490
491
492
493
494
495
496
497
498
499
500
501
502
503
504
505
506
507
508
509
510
511
512
513
514
515
516
517
518
519
520
521
522
523
524
525
526
527
528
529
530
531
532
533
534
535
536
537
538
539
540
541
542
543
544
545
546
547
548
549
550
551
552
553
554
555
556
557
558
559
560
561
562
563
564
565
566
567
568
569
570
571
572
573
574
575
576
577
578
579
580
581
582
583
584
585
586
587
588
589
590
591
592
593
594
595
596
597
598
599
600
601
602
603
604
605
606
607
608
609
610
611
612
613
614
615
616
617
618
619
620
621
622
623
624
625
626
627
628
629
630
631
632
633
634
635
636
637
638
639
640
641
642
643
644
645
646
647
648
649
650
651
652
653
654
655
656
657
658
659
660
661
662
663
664
665
666
667
668
669
670
671
672
673
674
675
676
677
678
679
680
681
682
683
684
685
686
687
688
689
690
691
692
693
694
695
Session 1 notes - introduction
- big data = structured or unstructured data difficult to process using standard techniques
- capture > store > analyse
- database = data storage
- SQL = for data access and processing
- examples of common streams
- software engineering: programming and development
- data analysis: data science and processing
- data science steps
- frame the problem to solve
- collect and store raw data
- process the data
- explore the data
- in-depth analysis
- communicate and visualise results
- database
- multiple tables within database, can relate to one another
- relational database model RDBM
- stores data in tables, which relates pieces of data to one another
- tables
- consist of rows (often called records) and columns
- junction table: combinations of other tables, e.g. which employee is in which department,
cannot exist without employee or department tables
- need to specify constraints per column: type of data (mandatory), size of data, format,
required (not null) or not (null)
- NULL value - process cell without data without problems, must explicitly include
- data types (more info here: https://www.w3schools.com/sql/sql_datatypes.asp)
- char(size) i.e. fixed length string, can be from 0 to 255, 1 by default
- varchar(size) i.e. variable length string, can be from 0 to 65535
- binary
- float(p) where p is precision
- decimal(m,d) where m is total digits and d is digits after decimal
- boolean where 0 is FALSE and all other values are TRUE
- date i.e. YYYY-MM-DD
- time i.e. HH:MM:SS
- timestamp i.e. YYYY-MM-DD HH:MM:SS
- attribute = property of an object, e.g. data hired, date of birth, can identify using an attribute
- many to many relationships allowed, e.g. can have many salaries to account for pay rise
- https://coderpad.io/sandbox
- relational database management system RDBMS
- program to create, update and administer a relational db_name
- SQL (Structured Query Language) most commonly used language
- MySQL
- most popular open source SQL db
- easy to use, inexpensive, reliable, large community
- poor performance when scaling, open source dev slow, missing some advanced features
- controlled by Oracle
- PostgreSQL
- open source, not controlled by any corporation, used for web app dev
- easy to use, inexpensive, reliable, large community, some additional features
- slower performance, not as commonly offered as managed instance
- Oracle DB
- close source (Oracle)
- large apps (banking - prebuilt functionality)
- expensive
- SQL Server
- close source (Microsoft)
- large enterprise apps
- expensive
- SQLite
- open source
- can store an entire db in a single file so data can be stored locally e.g. db in mobile phone
- https://coderpad.io/sandbox to practice online
- data in db
- should be singular source of truth
- should be easy to query
- want to manage and maintain data efficiently
- want to persist data and keep it safe
- query syntax
keyword identifier keyword identifier;
- SELECT syntax
- SELECT column_name1, column_name2 FROM table_name;
- SELECT * FROM table_name; where * is a wildcard character to select all columns from a table
- SELECT table.column1, table.column2 FROM table;
- SELECT t.column1, t.column2 FROM table t;
- can alias a table using ... FROM table t;
- CREATE DATABASE and USE db syntax
- CREATE DATABASE db_name; not SQL standard
- USE db_name;
SELECT * FROM table_name;
- SELECT * FROM db_name.table_name;
- CREATE TABLE syntax
- CREATE TABLE table_name (column1 INTEGER, column2 VARCHAR(50), column3 VARCHAR(50), column4 INTEGER);
define name and type at min
- INSERT INTO syntax
- populate tables with data, i.e. add new rows of data
- keep names clear and user-friendly
- INSERT INTO table_name (col1, col2, col3) VALUES (val1, val2, val3);
- INSERT INTO syntax pt 2
- insert multiple rows with one statement
- INSERT INTO table_name (col1, col2, col3) VALUES (val1,val2,val3), (val4,val5,val6), (val7,val8,val9);
- INSERT INTO table_name (col1, col2, col3) SELECT * FROM old_table_name WHERE old_table_name.col>10;
- homework task 1
- review slide in reference materials - done
- write a paragraph about what I want to do with SQL - done
- homework task 2
- create sweet and savoury tables in bakery database - done
- create parts database - done
##############################################################################################################
Session 2 notes - SQL and db management techniques
- DISTINCT
- keyword requesting a unique set of results
- SELECT DISTINCT alias.column_name FROM table_name alias;
- WHERE
- checks for conditions to match for results to qualify
- comes after FROM
- Boolean expression, such as =, !=, >, < etc.
- SELECT alias.column_name FROM table_name alias WHERE alias.column_name = 'value';
- database normalisation
- create table about one specific topic only (nonstructured data -> structured data)
- benefits
- minimise duplicate data
- minimise/avoid data modification issues
- simplify queries
- i.e. reduce data redundancy and improve data integrity
- + reduce data complexity, ensure relationship between tables as well as data,
ensure data dependecies and data are logically stored
- common forms
- 1st (or 1NF)
- contains only automic values (i.e. one value per cell)
- each record is unique, no repeating groups (i.e. table doesn't contain 2 or more
values of columns that are closely related)
- 2nd (or 2NF)
- 1NF
- no partial dependency of any column on primary key
- all non-key attributes are fully functionally dependent on primary key
- 3rd (or 3NF)
- 2NF
- no transitive functional dependency, i.e. table 1 dependent on table 2 and table 2
dependent on table 3 is transitive dependence, which isn't allowed to occur
- benefits: data integrity, no data duplication
- Boyce-Codd (or BCNF or 3.5NF)
- used to handle anomalies which are not accounted for in 3NF
- all redundancy based on functional dependecy removed
- SQL constraint types
- constraints = rules to restrict type of data in a table
- ensure accuracy and reliability of data in table
- can be set at column or table level
- constraint violation > action aborted
- CREATE TABLE table_name (col1 Type, col2 Type, col3 Type,
CONSTRAINT contstraint_name constraint_type (col_that_it_applies_to));
- CREATE TABLE customers (customer_id INTEGER, name VARCHAR(50),
surname VARCHAR(50) NOT NULL, telephone INTEGER CONSTRAINT pk_customer_id PRIMARY KEY (customer_id));
- NOT NULL
- once applied, cannot pass a null value to the column
- cannot be defined at table level
- CREATE TABLE table_name (col1 INTEGER NOT NULL, col2 VARCHAR(50));
- UNIQUE
- column must only have unique values, i.e. duplicate values not allowed
- can be applied at column or table level
- CREATE TABLE table_name (col1 INTEGER NOT NULL UNIQUE, col2 VARCHAR(50));
- ALTER TABLE table_name ADD UNIQUE (col1);
- CHECK
- restrict value of a column between a range
- CREATE TABLE table_name (col1 INTEGER NOT NULL CHECK (col1 > 0), col2 VARCHAR(50));
- ALTER TABLE table_name ADD CHECK (col1 > 0);
- DEFAULT
- set default value for column
- CREATE TABLE table_name (col1 INTEGER NOT NULL UNIQUE, col2 VARCHAR(50) DEFAULT 'value');
- PRIMARY KEY
= unique identifier of row per table, can be used to refer to other tables
- cannot be null
- must be unique
- can be single or multiple columns (called compound key)
- can be defined in CREATE TABLE or ALTER TABLE statements as PRIMARY KEY
- CREATE TABLE table_name (col1 Type KEY DEFINITION, col2 Type, col3 Type);
- CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name VARCHAR(50),
surname VARCHAR(50) NOT NULL, telephone INTEGER);
- CREATE TABLE table_name (col1 Type, col2 Type, CONSTRAINT pk_name PRIMARY KEY col1);
- FOREIGN KEY
- relates two tables, refers to a column (often primary key) in a different table
- restricts actions that would destroy links between tables
- enforces referential integrity, reinforces relationships between tables (parent/referenced
and child/referencing tables)
- can be non-unique
- allows null values
- called recursive/self-referencing foreign key if primary key and foreign key in same table
- to insert foreign key value into child table, that value must be present in parent table
- table can have more than one foreign key, and each can refer to a different parent table
- when record deleted in parent table
- ON DELETE CASCADE - remove record from child table
- ON DELETE SET NULL - set record to NULL in child table
- otherwise not allowed to delete in parent table
- define ON DELETE and ON UPDATE actions
- CASCADE - push change from parent table to child table
- SET NULL - if value changed in parent table, set to NULL in child table
- RESTRICT - parent value cannot be edited, default
- CREATE TABLE table_name (col1 Type PRIMARY KEY, col2 Type,
col3 Type FOREIGN KEY REFERENCES table_name2 (col2) ON UPDATE CASCADE ON DELETE CASCADE);
- ALTER TABLE table_name ADD CONSTRAINT constraint_name FOREIGN KEY (col_that_it_applies_to)
REFERENCES table_name2 (col2_that_it_applies_to);
- ALTER TABLE table_name DROP FOREIGN KEY constraint_name; to drop foreign key
- SET foreign_key_checks = 0; to disable foreign key checks
- SET foreign_key_check = 1; to enable foreign key checks
- data integrity
- parent table with unique primary keys
- referential integrity - aggregate information across sources
- child table with its own primary key and a foreign key to link back to parent table
- database relationships
- one to one
- one to many
- many to many
- many to one
- core commands DDL
- DDL = Data Definition Language
- subset of SQL statements that change structure of db schema
- CREATE
- ALTER TABLE
- change an existing table, add/remove columns, change data type, change constraints
- must agree with current data
- ALTER TABLE table_name ADD CONSTRAINT constraint_name constraint_type (col_that_it_applies_to)
REFERENCES table_name2 (col2_that_it_applies_to);
- ALTER TABLE table_name MODIFY COLUMN column Type;
- DROP TABLE
- removes table and all associated data
- error if table is a foreign key to another table
- DROP TABLE table_name;
- TRUNCATE
- COMMENT
- RENAME
- DML commands
- DML = Data Manipulation Language
- SELECT
- INSERT
- UPDATE
- modify columns in a single table
- use SET and WHERE keywords
- UPDATE table_name SET table_name.col1 = new_value WHERE table_name.col2 = value;
- DELETE
- delete one or more rows permanently
- DELETE FROM table_name; to empty table permanently
- DELETE FROM table_name WHERE table_name.col = value;
- MERGE
- CALL
- EXPLAIN PLAN
- LOCK TABLE
- DCL = Data Control Language
- GRANT
- REVOKE
- TCL = Transaction Control Language
- COMMIT
- ROLLBACK
- SAVEPOINT
- SET TRANSACTION
- SHOW CREATE TABLE table; to show table information
- homework task 1
- review slides - done
- write queries for parts db - done and submitted
- homework task 2
- create shop db with tables - done
##############################################################################################################
Session 3 notes - data analysis
- comparison operators
- relational opearators that compare values of operands
- returns True or False
- <, >, <=, >=, =, != or <>
- condition keywords
- AND, OR, IS NOT (only used for NULL)
- SELECT alias.column FROM table_name alias WHERE alias.column = condition
AND alias.column2 > condition OR alias.column IS NOT NULL;
- logical operators
- searching for patterns or searching without exact search criteria
- BETWEEN, NOT BETWEEN - values within a range
- LIKE - character string matching a specified pattern using % to indicate
other characters
- wildcard % - zero, one or however many characters
- wildcard _ - only one single character
- LIKE '%m' - any string ending with 'm'
- LIKE 'm__%' - any string beginning with 'm' and at least three characters long
- LIKE 'm%t' - any string beginning with 'm' and ending with 't'
- IN - multiple values included in a WHERE clause
- IS - similar to equals, to be used when values may be NULL
- SELECT alias.column FROM table_name alias
WHERE alias.column BETWEEN value1 AND value2;
- SELECT alias.column FROM table_name alias
WHERE alias.column LIKE 'M%';
- SELECT alias.column FROM table_name alias
WHERE alias.column IN (value1, value2, value3);
- aggregation and order
- organise, understand, interpret and analyse data
- ORDER BY
- used with SELECT, comes after WHERE
- need to specify one or more columns to order by, must be present in SELECT clause as well
- ASC and DESC, ASC by default
- SELECT alias.column FROM table_name alias ORDER BY table_name.column;
- set functions
- 'helper' functions, used in place of columns in SELECT clause
- compute a new value from passed columns
- often use DISTINCT
- COUNT - count all column values specified, includes NULL values if * used,
* used to count all rows which fit into groups
- MAX - excludes NULL
- MIN - excludes NULL
- SUM - excludes NULL
- AVG - average of all values in column, excludes NULL
- LISTAG - transforms values from a group of rows into a delimited string
- and many others, incl. trig and log functions such as sin, cos, tan, asin,
sinh, cosh, tanh, log(base, value), ln(value)
- SELECT SUM(alias.column) FROM table_name alias;
- SELECT COUNT(DISTINCT alias.column) FROM table_name alias;
- GROUP BY
- groups rows that have same values
- allows multiple columns with a set function - breaks results into subsets,
runs set function against each subset, returns 1 row of results per subset
- summarises data
- columns specified in GROUP BY must appear in SELECT clause
- SELECT COUNT(alias.column), alias.column FROM table_name alias
GROUP BY alias.column;
- HAVING
- WHERE cannot be used with aggregate functions -> HAVING added
- filters records that work on summarised GROUP BY results
- WHERE and HAVING can be used in same query
- SELECT COUNT(DISTINCT alias.column) alias.column FROM table_name alias
GROUP BY alias.column HAVING COUNT(DISTINCT alias.column) > value;
- GROUP BY aggregates data and creates subsets, HAVING restricts GROUP BY
- homework
- review notes - done
- finish shop queries - done
##############################################################################################################
Session 4 - data analysis for complex structures
- JOIN
- combines columns from two or more tables, based on a related column between them
into a single result set
- can only join columns of same data type
- can do multiple JOIN statements in one query
- INNER JOIN
- returns rows when there is at least one match in both tables
- only returns rows in common between tables
- doesn't accept NULL values
- SELECT alias1.column, alias2.column FROM table1 alias1
INNER JOIN table2 alias2 ON alias1.column = alias2.column;
- LEFT OUTER JOIN or LEFT JOIN
- all rows from the left side with the matching rows from right side returned
- gives NULL if no values matching in right table
- SELECT alias1.column, alias1.column, alias2.column FROM table1 alias1
LEFT JOIN table2 alias2 ON alias1.column = alias2.column;
- RIGHT OUTER JOIN or RIGHT JOIN
- all rows from the right table with the matching rows from left side returned
- gives NULL if no values matching in left table
- SELECT alias1.column, alias1.column, alias2.column FROM table1 alias1
RIGHT JOIN table2 alias2 ON alias1.column = alias2.column;
- FULL OUTER JOIN or FULL JOIN
- returns rows from either table when conditions are met
- returns NULL if no values match
- equivalent to LEFT JOIN and RIGHT JOIN
- doesn't work in MySQL
- CROSS JOIN
- matches all rows to all rows - returns cartesian product
- SELECT alias1.column, alias1.column, alias2.column FROM table1 alias1
CROSS JOIN table2 alias2 ON alias1.column = alias2.column;
- self join
- uses same table with different aliases to join a table to itself
- no special syntax, can use RIGHT JOIN
- EQUI JOIN and NON-EQUI JOIN
- uses / doesn't use equal sign as comparison
- NATURAL JOIN
- all columns in two tables with same name
- KEYWORD JOIN
- uses keyword for joining tables when columns have same name in both tables
- UNION
- combines two or more result sets of queries into a single result set
- tables get longer, not wider
- stacks tables on top of each other (doesn't create longer rows)
- tables must have same columns in same order, doesn't match on a specific column
- by default excludes duplicate rows (only includes it once)
- UNION ALL to retain duplicate rows
- subquery
- nested query where results of one query used in another query using a relational
operator or aggregation function
- must be in parentheses
- can only have one column in SELECT clause if used in WHERE clause
- ORDER BY clause not allowed in subquery
- can be nested within other subqueries
- used in WHERE, HAVING, FROM and SELECT
- much quicker than joins
- homework
- review notes - done
- parts queries - done
##############################################################################################################
Session 5 - data manipulation with SQL
- functions
- built-in and user-defined
- piece of code performs operations and returns a result
- most common types: string functions, numeric functions, date functions
- when used with SELECT statement, data not changed
- used with UPDATE statement to update data using a function
- built-in examples:
- length()
- upper()
- lower()
- concat()
- date_format()
- cast()
- trim()
- reverse()
- now()
- power()
- transactions
- manilupating data and retaining changes
- unit of work that is performed against a database in a logical way
- after a transaction
- commit > apply change to db
- roll back > do not retain change
- only complete if each part of operation fully successful
- prevents partial operations from being committed to db
- maintain referential data integrity, minimise data errors
- MySQL automatically commits (SET autocommit to ON or OFF)
- START TRANSACTION; / BEGIN; / BEGIN WORK;
run code
COMMIT; / ROLLBACK;
- cannot rollback after you commit
- using IF statements
- IF(condition, variable i.e. column, what to display if condition not fulfiled)
- IF(balance > 0, balance, 0)
- variables
- SET @variablename = value
- table locking
- prevent other users from being able to editing/viewing a table
- ensure data integrity
- read lock
- can read but cannot write
- automatically terminated when session ends
- write operations from other sessions queued until lock released
- can be held by mutliple sessions - e.g. a team
- shared lock
- LOCK TABLE table_name READ;
update
UNLOCK TABLE;
- write lock
- table not visible to other users
- can only be held by one session
- exclusive lock
- LOCK TABLE table_name WRITE;
update
UNLOCK TABLE;
- LOCK TABLES table1 READ, table2 WRITE;
update
UNLOCK TABLES;
- https://www.mysqltutorial.org/mysql-table-locking/
- CFG project
- create a db with relationships
- normalised table
- homework
- review slides - done
- run through self-study - done
- project idea, summary, name + coworker - done
##############################################################################################################
Session 6 - data management using SQL coding techniques
- views
- virtual table based on result of SQL statement
- do not store data, virtual snapshot of representation of a table
- view info without directly accessing db > cannot access other info in db
- based off base tables
- can have nested views
- allow for GDPR and security restrictions by using WHERE clauses
- CREATE VIEW view_name AS SELECT query...
- updates
- updatable for simple queries > can update data not visible through view >
view inconsistent
- not updatable for complex queries
- with check option
- prevent modifying/inserting a row in such a way that it would no longer
be part of the view result
- ensure consistency of data
- doesn't store any data, just query definition
- 4 types of stored programs
- stored procedure
- stored function
- trigger
- event
- stored procedures
- subroutine available to applications that access a rdbs
- one or more SQL statements
- used for data validation and access control methods
- called sproc or procedures
- can pass multiple parameters
- advantages
- can be reused
- decrease in network traffic
- encapsulation of business logic
- delegation of access rights
- protection from SQL injection
- transaction management
- improved performance
- DELIMITER //
CREATE PROCEDURE proc_name (input_parameters TYPE)
BEGIN
DECLARE variable;
proc_logic
END //
DELIMITER ;
CALL proc_name(input_parameters);
- set delimiter since ; may need to be used within the procedure definition
- can have multiple begin/end statements
- output can be a table
- stored functions
- executable db object with SQL procedural code
- also called user defined functions
- cannot modify db by INSERT, UPDATE or DELETE statements
- MySQL > scalar functions only > single value returned
- can be used in SELECT statements
- DELIMITER //
CREATE FUNCTION func_name (param TYPE)
RETURNS TYPE
BEGIN
DECLARE variable;
func_logic
RETURN(result)
END //
DELIMITER ;
CALL func_name(input_parameters);
- comparison between procedures and functions
- procedure: returns many values, input and output parameters, can't be used in
SELECT, can call functions, may not return value, read and modify data,
INSERT/SELECT/UPDATE/DELETE, transaction management
- function: returns one value, input parameters only, can be used in
SELECT, cannot call procedures, must return value, read data only,
SELECT only, no transaction management
- AUTO_INCREMENT PRIMARY KEY
- to automatically increate value of primary key by one when adding in rows
- triggers
- blocks of code executed automatically when DML operations are executed
(e.g. INSERT, UPDATE, DELETE)
- trigger execution called trigger firing
- format
- FOR EACH ROW clause for a row-level trigger
- OLD keyword gets a value from row that is being updated/deleted
- NEW keyword gets a value from row that is being inserted/updated
- note: NEW for INSERT and UPDATE and OLD for UPDATE and DELETE
- enforce rules for data consistency and integrity
- maintain the audit table
- 2 types
- before trigger
- after trigger
- when updating the same table trigger is based on:
CREATE TRIGGER trigger_name
BEFORE/AFTER INSERT/UPDATE/DELETE
ON table_name
FOR EACH ROW
trigger body e.g. SET NEW.updated = curdate();
- when updating a different table, include UPDATE statement:
CREATE TRIGGER trigger_name
BEFORE/AFTER INSERT/UPDATE/DELETE
ON table_name
FOR EACH ROW
UPDATE table_name2
trigger body e.g. SET NEW.updated = curdate();
- include UPDATE table_name2 statement only if updating a different table
- can use BEGIN END without triggers to include multiple statements
- remember to use NEW/OLD keywords
- events
- blocks of code executed automatically according to event scheduler
- event execution called event firing
- 2 types
- one time
- recurring
- maintenance tasks
- to show existing events:
SHOW PROCESSLIST;
SHOW EVENTS FROM db;
- to create an event:
CREATE EVENT [IF NOT EXISTS] event_name
ON SCHEDULE schedule
ON COMPLETION PRESERVE
DO
body of event e.g. INSERT INTO ...;
- ON COMPLETION PRESERVE retains the event
- can use relative schedules, e.g. CURRENT_TIMESTAMP + INTERVAL 1 MINUTE
- example recurring schedule:
ON SCHEDULE EVERY 1 MINUTE
STARTS CURRENT_TIMESTAMP
ENDS CURRENT_TIMESTAMP + INTERVAL 1 HOUR
- to drop an event:
DROP EVENT event_name
- homework
- review slides - done
- write function, procedure, trigger and event - done
- link back to project - done
- download and install tableau public (free version) - done
##############################################################################################################
Session 7 - db design and data visualisation
- mysqldump and mysql commands
- back up one or multiple dbs or a table
- can back up logic, not just tables
- use cmd to backup
- find mysqldump file - located in
Program Files/MySQL/MySQL Server 8.0/bin
- dir to view all files in a folder
- using localhost and root user:
mysqldump options > folder
e.g.
mysqldump -h 127.0.0.1 -u root -p bakery > C:\bakery\bakerybackup.sql
mysqldump -u root -p --databases bakery bank > C:\bakery\bakery_bankbackup.sql
mysqldump -u root -p --all-databases > C:\backups\all_dbs.sql
mysqldumpt -u root -p bakery sweet > C:\backups\bakery_sweettable.sql
--no-data to include db structure only
--no-create-info to include data only
--triggers --events --routines
- db restore
- always test
- to restore:
mysql options < folder
e.g.
mysql -h 127.0.0.1 -u root -p bakery < C:\bakery\bakerybackup.sql
- can restore from workbench
- entity relationship diagram
- workbench - reverse engineering to display the diagram
- workbench > db > reverse engineer
- can import data from csv
- can export results into csv
- homework
- review slides - done
- project - done
- include min 5 tables, ER diagram
- primary and foreign keys
- joins, stored function, query with subquery
- 3 out of: stored procedure, trigger, event, view with 3-4 base
tables plus query, group by and having
##############################################################################################################
Session 8 - project presenations
- show data source
- show ER diagram and explain links between tables
- show queries
##############################################################################################################
Session 9 - extra info session
- run SQL queries to get data out of db, then use python and R to build info
- what to expect in a tech interview
- wordy part
- skills part
- can be take-home
- in interview
- in programme on screenshare
- on whiteboard - harder since can't test
- will make mistakes > don't panic and learn to read/fix errors in a calm manner
- practice both options
- when to apply for job
- research the job / company in detail
- courses
- code academy (not free)
- youtube
- data science - need a solid understanding of algorithms + coding
- regression models - easy to explain, decision trees, clustering, neural networks
- mock interview
- example query: top 3 authors who sold most books in total
SELECT a.author_name, SUM(b.sold_copies) total_copies_sold
FROM authors a
INNER JOIN books b ON a.book_name = b.book_name
GROUP BY a.author_name
ORDER BY SUM(b.sold_copies) DESC
LIMIT 3;
- example query: print every department where the average salary per employee is lower than £500
SELECT e.department_name, SUM(s.salary)/COUNT(s.salary) AS average_salary
FROM employees e
INNER JOIN salaries s
ON e.employee_id = s.employee_id
GROUP BY e.department_name
ORDER BY average_salary ASC
HAVING average_salary < 500;
can use AVG(salaries.salary)
- example questions
- what is MySQL? what is SQL? db management software, db querying language
- default port: 3360
- how can you find out version of installed MySQL? run a command
SHOW VARIABLES LIKE “%version%”;
- differences between char and varchar? char is fixed length,
varchar is anything up to set value, char is faster
- filter duplicate data: use DISTINCT
- statement is SELECT query for partial matching: LIKE or regex
- change name of existing table using SQL statement:
RENAME TABLE table_name TO new_name
- add/remove column in table:
ALTER TABLE table_name ADD COLUMN column_name column_definition [FIRST|AFTER existing_column]
- what is a view? virtual table, returns data without creating a table in db,
table can be seen but isn't maintained in db
- difference between primary key and unique key: primary can never be null,
unique key can be null
- what is a join? method of combining two or more tables together
- 4 main types of join: left (keeps all rows in first table),
right (keeps all rows in second table),
inner (keeps rows which are in both tables),
outer (keeps rows in one table only)
- sum values: use SUM(), must be numeric
- transaction: group of db operations done as single unit, every operation
must be successful for transaction to be successful, can commit / roll back
- functions of commit and roll back statements: commit to push changes to db,
roll back to return to previous state