IFB105 — Learn Everything Flashcards Practice exam
0 of 0 topics marked done

Learn the whole unit — no PDFs required

This page teaches every examinable topic in IFB105 from scratch, in the order the unit builds them: model the world → map it to tables → clean the tables → query them → run them safely → do it ethically.

Open book · paper notes only 2 hours whole semester CSDP steps 6–7 not examined

0 How it all fits together

A database stores facts about the real world. The unit is one pipeline for building one properly. Keep this order in your head — most exam questions are "do step X" or "spot the mistake in step X".

  1. Understand the domain → write it as plain-English elementary facts (ORM Step 1).
  2. Model it → draw an ORM diagram (the conceptual schema) using the 7-step CSDP.
  3. Convert to tablesRelational Mapping (RMap) turns the diagram into table definitions.
  4. Clean the tablesNormalisation removes redundancy (1NF → 2NF → 3NF).
  5. Build & querySQL (DDL creates tables, DML inserts/queries data).
  6. Run it safelysecurity & administration (views, roles, access control, transactions).
  7. Do it ethicallylegal & ethical issues, privacy law and Indigenous data principles.

1 Databases & the relational model

A database is an organised collection of related data. A DBMS (e.g. MySQL) is the software that stores, manages and queries it. In the relational model, data lives in tables (also called relations).

  • A row (a tuple / record) = one instance, e.g. one student.
  • A column (an attribute / field) = one property, e.g. studentName.
  • Every row should be unique.
  • Each cell holds a single atomic value — one value, not a list.
Information = Data + Semantics (the data plus its meaning).

Why multiple tables? Redundancy & anomalies

If you cram everything into one table you get data redundancy — the same fact stored many times. That's bad for two reasons:

  • Integrity — if the fact changes you must update every copy; miss one and the data is inconsistent.
  • Efficiency — wasted storage and slower entry.

Redundancy causes three anomalies (memorise these — common MCQ):

AnomalyMeaning
Update / modificationYou must change the same fact in many rows.
InsertionYou can't add a new fact without also supplying unrelated data.
DeletionDeleting a row accidentally destroys other facts you wanted to keep.
The guiding principle: each row is an elementary fact, and we don't repeat facts. Splitting data across linked tables (and normalising) fixes this.

The three-level ANSI-SPARC architecture

Databases are described at three levels so storage can change without breaking the programs that use it:

LevelWhat it is
ExternalWhat individual users/apps see — their own views (many external schemas).
ConceptualThe whole logical structure — entities, relationships, constraints. This is the level ORM models.
InternalHow data is physically stored — files and indexes.

Data independence is the payoff: you can change one level without affecting the others (logical & physical independence).

2 Legal, ethical & Indigenous data issues

Ethics = a set of principles of correct conduct / a system of moral values (Connolly & Begg). As Potter Stewart put it: "knowing the difference between what you have the right to do and what is right to do."

Legal vs ethical: they're not the same. Laws lag behind technology, so ethics fills the gap. Not all unethical behaviour is illegal, and not all legal behaviour is ethical.

Why organisations care (compliance)

RegimeKey facts
NDB scheme (Notifiable Data Breaches)From 22 Feb 2018; applies at turnover ≥ $3M; fines up to $420K (individuals) / $2.1M (companies).
GDPR (EU)Max fine = 4% of global turnover or €20M, whichever is greater. Adds the "right to be forgotten".
Privacy Act 1988 (Aus)Plus the Data Retention Law (2017).

Two ethical concern areas

  • Data collection — what/how you collect; secretive or disingenuous practices (vague "Terms & Conditions", the 'Herod clause'); re-identifiability of "anonymised" data.
  • Data usage — how collected data is used; grey areas like surge pricing from personal data, reusing data collected unethically, "enhanced service" vs "creepy".

Professional codes you can cite: ACM, ACS (Australian Computer Society), BCS (British Computer Society).

Indigenous data & the CARE principles

Indigenous Data Sovereignty (IDS) = the right of Indigenous peoples to govern the collection, ownership and application of data about their communities. Indigenous Data Governance (IDG) is how that data is managed to support IDS. The unit's exam ethics question is almost always: "does this scenario comply with [a letter] of CARE?"

CARE (GIDA 2019) — learn all four:

LetterPrincipleMeans
CCollective BenefitData use must support community outcomes — inclusive development, governance, equitable outcomes.
AAuthority to ControlCommunities decide what data is collected, accessed and shared.
RResponsibilityRespect cultural protocols, avoid harm, build positive relationships, support Indigenous worldviews.
EEthicsHonour worldviews; minimise harm, maximise benefit; justice; consider future use.

CARE complements FAIR (Findable, Accessible, Interoperable, Reusable).

The Havasupai case (the model answer pattern). A tribe gave blood consenting only to a diabetes study; researchers later reused it for mental-illness/alcoholism/migration studies without consent, and published stigmatising results. Assess against CARE — every letter is "does NOT comply":

  • C — the tribe was excluded from the reuse; later studies gave no benefit and were harmful.
  • A — no informed consent for reuse, no say in storage/use, data not accessible to them.
  • R — research disrespected dignity/worldviews; no positive relationship maintained.
  • E — stigmatising use, ignored power imbalance and appropriate future use.
For any "does this comply with [letter]?" question: (1) state what that letter means, (2) point to the specific facts, (3) conclude comply / does not comply. Three sentences = full marks.

Securing data (the security toolkit)

  • Views — a view is a stored/named query ("virtual table"). Grant a user SELECT on a view that exposes only safe columns → hides sensitive columns and the base table. The main security mechanism examined.
  • Backup vs recovery — backup = periodic copy of the DB and log file offline; recovery = replace a damaged DB with a backup. MySQL: Hot/Logical (server stays online, mysqldump) vs Cold/Physical (server offline, faster file copy).
  • Journaling — a log of all changes; minor failure → auto-recover from the journal, major failure → restore from backup.
  • Auditing (Oracle) — statement, privilege, object auditing.
  • Encryption — a cipher makes data unreadable without the key. Public/asymmetric (encrypt with public key, decrypt with private) vs Private/symmetric (same key both ways).
  • RAID — Redundant Array of Independent Disks → fault tolerance (disks are the most failure-prone component).

3 Object-Role Modelling (ORM)

ORM models the conceptual schema as objects playing roles. You start with plain-English facts and end with a diagram, following the 7-step CSDP (Conceptual Schema Design Procedure).

Elementary facts (CSDP Step 1)

An elementary fact is a simple, atomic assertion that particular objects play particular roles — it can't be split into smaller facts about the same objects.

Elementary facts must not use logical connectives (not, and, or, if) or quantifiers (all, some).
✗ "Mary and David work in Sales" (split into two) · ✗ "All staff earn $50k" · ✓ "The Student with ID 'n7' is enrolled".

Objects: entities vs values

  • Value — a self-identifying constant (e.g. "Australia", 17). Drawn as a dashed rectangle.
  • Entity — needs a definite description. Drawn as a solid rounded rectangle.

Fact types & reference modes

An entity is identified by three things: entity type + reference mode + value — e.g. Employee (.ID) 'E0002'. (In "The Lecturer with StaffID 'E22'…", StaffID is the reference mode — a classic MCQ.)

A predicate is a sentence template with 'object holes'; its arity = the number of roles: unary (1), binary (2), ternary (3), quaternary (4). A fact type = object type(s) + a predicate, e.g. Employee works in Department.

Surface vs deep structure: "Mary works in Sales" and "Sales employs Mary" look different but are the same fact type (same deep structure).

Counting objects. "The Customer 'C3' bought the Ticket 'T8' for the Event 'E1'" → 3 objects (Customer, Ticket, Event). "The Ticket 'T8' is refunded" → 1 object (unary).

Diagram notation — how to read & draw it

Employee (.ID) Entity type: solid rounded box, with (reference mode) EmpName Value type: dashed box works in ▶ Emp Dept Binary predicate = two role boxes between two object types Uniqueness bar (over a role) = that side is unique Mandatory-role dot = every instance must play this role
The ORM shapes you must recognise and draw.

Constraints & reading multiplicity

A uniqueness constraint (UC) bar means the values under that role (or combination) are unique. Read multiplicity from where the bar sits on a binary:

Where the UC bar isMultiplicity
Under one role only1:N (functional) — that side is the "one" side
Spanning both rolesm:n (many-to-many)
A separate bar on each role1:1 (one-to-one)

A mandatory-role dot means every instance of that object type must play that role (no NULLs there). A "?" in a population = NULL = optional role.

"Exactly one" = uniqueness (at most one) + mandatory (at least one), together.

The n−1 rule & legal diagrams

  • n−1 rule: in an n-ary fact type, a uniqueness constraint must span at least n−1 roles.
  • Legal-diagram rules: objects must play a role (you can't join two object types directly); each role connects to exactly one object type (no role played by two types; no dangling role).
Derived facts are marked with * and a rule (e.g. "for each Article, markup = retailPrice − wholesalePrice") and are never stored — storing derivable data wastes space and can go wrong when base facts change.

The 7-step CSDP

The procedure that turns facts into a diagram. Steps 6–7 are not examined, but know the first five cold.

  1. Transform familiar examples into elementary facts.
  2. Draw the fact types and apply a population check (populate with sample instances).
  3. Check for entity types to be combined; note arithmetic derivations. (Combine when the same entity belongs to two types — e.g. Stephen Chow is actor and director → one Person type.)
  4. Add uniqueness constraints and check arity.
  5. Add mandatory role constraints and check for logical derivations.
  6. Add value, set-comparison, subtyping constraints. not examined
  7. Add other constraints; final checks. not examined

4 Relational mapping (ORM → tables)

RMap converts the ORM diagram into table definitions in Optimal Normal Form (a minimal set of normalised relations). First you must know keys precisely.

Primary key Candidate key Super key
Every primary key is a candidate key; every candidate key is a super key.
KeyProperties
Super keyUnique.
Candidate keyUnique + Irreducible (no proper subset is unique).
Primary keyUnique + Irreducible + Entity integrity (no nulls).
Alternate keyAny candidate key not chosen as the primary key.
Foreign keyA column sharing a domain with a candidate key in another table; references that table.

Three inherent integrity constraints: Entity (PK unique & not null), Referential (a FK must match an existing row; deletion rules: cascade, restrict, set null, set default), Domain (values fit a data type).

RMap syntax — write it exactly like this

NotationMeans
Single underlinea uniqueness constraint (can be more than one)
Double underlinethe primary key (when other UCs exist)
[ brackets ]optional column (NULLs allowed)
dotted arrow →foreign key, pointing toward the primary key it references
Student {studentID, name, email, [phone]}
Studies {studentID, unitID}      FK studentID → Student,  FK unitID → Unit
Unit    {unitID, title}

The four mapping rules (apply in strict order 1→2→3→4)

RuleApplies toResult
1UC spanning >1 role (m:n & ternary+)its own table; PK = the UC
2entity with a simple UC (functional role)one table per entity; PK = reference mode
3nested fact typetreat as an entity (like Rule 2); PK = the nested UC
41:1 fact typeplace the fact in one relation — sub-rules 4.1–4.4

Rule 4 decision tree (1:1)

  1. Does either object play another functional role? No4.3: the 1:1 fact gets its own table.
  2. Only one does → 4.1: group the fact onto that object's table.
  3. Both do → is exactly one role mandatory? Yes4.2: group onto the mandatory side. Neither/both4.4: your discretion, minimise NULLs.

Worked RMap. Member (.memberID) has Name (1:1, mandatory) & [email]; Member enrolled in Class (.classID) is m:n; Class has ClassName; Class led by Trainer (.trainerID) is n:1; Trainer has TrainerName.

Member     {memberID, name, [email]}
Class      {classID, className, trainerID}   → trainerID FK → Trainer
Trainer    {trainerID, trainerName}
EnrolledIn {memberID, classID}              → FKs to Member, Class   (Rule 1)

5 Normalisation

Normalisation improves integrity by minimising redundancy — every non-key attribute fully depends on the whole key, and each fact is stored once. It's driven by functional dependencies.

Functional dependencies (FDs)

X → Y ("X determines Y") means: for each value of X there is at most one value of Y. X is the determinant, Y the determinee. (It does not imply causation — just identification.) A key determines the non-key attributes.

DependencyDefinitionExample
Partiala non-key attribute depends on only part of a composite key (only with composite keys)(Part, Warehouse) is the key but Warehouse → WarehouseAddress
Transitivea non-key attribute depends on another non-key attributeEmployee → Department → Location
Fulldepends on the whole key(Part, Warehouse) → Quantity

The normal forms — 1NF → 2NF → 3NF

1NFatomic values 2NFno partial dep. 3NFno transitive dep.
Each form includes the one before it. The unit goes to 3NF only.
FormRequirementFix to get there
1NFatomic values only (no repeating groups/arrays); ≥1 candidate keysplit comma-lists into separate rows
2NF1NF + no partial dependencies (only matters with a composite PK)move the partially-dependent attribute into a new table keyed on the part it depends on
3NF2NF + no transitive dependenciesmove the transitively-dependent attribute into a new table keyed on its non-key determinant

Normalise to 3NF. ENROLMENT, PK (StudentID, UnitCode): StudentName, UnitName, Grade, TutorID, TutorName. FDs: StudentID→StudentName, UnitCode→UnitName, (StudentID,UnitCode)→Grade,TutorID, TutorID→TutorName.

Partial deps: StudentName (on StudentID) and UnitName (on UnitCode). Transitive dep: TutorID→TutorName. 3NF result:

Student   {StudentID, StudentName}
Unit      {UnitCode, UnitName}
Tutor     {TutorID, TutorName}
Enrolment {StudentID, UnitCode, Grade, TutorID}   → FKs to Student, Unit, Tutor
2NF is only an issue when the PK is composite. If your PK is a single column, a 1NF table is automatically in 2NF.

6 SQL

SQL has two sub-languages here: DDL (define structure) and DML (manipulate data), plus DCL (permissions). Data integrity is enforced by constraints.

DDL — creating tables

CREATE TABLE Ticket (
  ticketID    INT PRIMARY KEY AUTO_INCREMENT,
  eventID     INT NOT NULL,
  customerID  INT NOT NULL,
  ticketType  ENUM('GA','VIP','Backstage'),
  price       DECIMAL(6,2),
  FOREIGN KEY (eventID)    REFERENCES Event(eventID),
  FOREIGN KEY (customerID) REFERENCES Customer(customerID)
);
  • NOT NULL = value required; UNIQUE = a candidate key (no duplicates).
  • Restrict values with ENUM(...) or a CHECK constraint.
  • ALTER TABLE t ADD COLUMN col TYPE; changes structure. DROP TABLE removes the whole table; DELETE removes rows only.

DML — insert / update / delete

INSERT INTO Ticket (eventID, customerID, ticketType, price) VALUES (2, 3, 'GA', 45);
UPDATE Ticket SET price = 50 WHERE ticketID = 4;     -- omit WHERE → every row!
DELETE FROM Ticket WHERE ticketID = 4;
"Will it execute?" questions test integrity: an INSERT with a FK value that has no parent row fails on referential integrity; a duplicate PK or UNIQUE value fails on entity/unique integrity.

SELECT & filtering

SELECT firstName, lastName FROM Customer
WHERE city = 'Brisbane'
ORDER BY lastName DESC;
  • Pattern matching with LIKE: % = any number of characters, _ = exactly one. So 'a%'=starts with a, '_a%'=2nd char is a, '%a'=ends with a.
  • ORDER BY is ascending by default; add DESC to reverse.
  • DISTINCT removes duplicate rows.

Grouping & aggregates

Aggregates: COUNT, SUM, AVG, MIN, MAX. GROUP BY collapses rows into groups so an aggregate is computed per group.

SELECT e.eventName, COUNT(*) AS ticketsSold
FROM Event e JOIN Ticket t ON e.eventID = t.eventID
GROUP BY e.eventID, e.eventName
HAVING COUNT(*) > 2;
WHERE filters rows BEFORE grouping; HAVING filters GROUPS after aggregation. This is one of the most common MCQs.

Set operations: UNION (combine, drop duplicates), UNION ALL (keep duplicates), INTERSECT (in both), EXCEPT (in the first but not the second).

Joins & subqueries

JoinReturns
INNER JOINonly rows matching in both tables
LEFT JOINall rows from the left table + matches (NULLs where none)
RIGHT JOINall rows from the right table + matches (NULLs where none)
CROSS JOINCartesian product (every left row × every right row)
SELECT c.firstName, c.lastName
FROM Customer c JOIN Ticket t ON c.customerID = t.customerID
WHERE t.ticketType = 'VIP';

A subquery is a query nested in another; the inner runs first. Use = for a single-row subquery, IN for a multi-row one.

SELECT eventName FROM Event
WHERE venueID = (SELECT venueID FROM Venue WHERE venueName = 'The Triffid');

DCL — permissions

GRANT SELECT, INSERT ON Orders TO clerk_role;
REVOKE INSERT ON Orders FROM clerk_role;

7 Security & administration

A role is a named bundle of privileges granted to users (simplifies admin).

  • DAC (Discretionary Access Control) — the owner grants privileges directly.
  • RBAC (Role-Based Access Control) — privileges are assigned via roles.
CREATE ROLE clerk_role;
GRANT SELECT, INSERT ON Orders TO clerk_role;
GRANT clerk_role TO user;

Transactions (ACID)

A transaction is a logical unit of work — a sequence of operations treated as one all-or-nothing action. It must be ACID:

PropertyGuarantee
Atomicityall operations happen, or none do
Consistencymoves the DB from one valid state to another
Isolationconcurrent transactions don't interfere (appear serial)
Durabilityonce committed, changes survive crashes

TCL commands: COMMIT (save), ROLLBACK (undo), SAVEPOINT (partial-rollback marker).

8 Exam technique

It's open book (unlimited paper notes, no digital), 2 hours, on-campus and handwritten, covering the whole semester. The lecturer warns: be selective — don't waste time flicking through too many pages.

  • Section A (MCQ): do these first, fast — they're 1 mark each. Don't overthink.
  • Section B (short answer): writing SQL, drawing ORM, doing RMaps, normalising, ethics. Show working — partial credit is real.
  • ORM: always start from elementary facts, then add UC bars, then mandatory dots. Reject any "fact" with and/or/not/if/all/some.
  • RMap: apply rules 1→2→3→4 in order; underline PKs, bracket optional columns, point FKs at PKs.
  • Normalisation: name the partial deps (→2NF) and transitive deps (→3NF) explicitly, then give the split tables.
  • SQL "will it run?": check entity integrity (duplicate PK/unique) and referential integrity (FK with no parent).
  • Ethics: for CARE, define the letter → cite the facts → conclude.
Then test yourself: hit the flashcards and sit the practice exam. Reading once isn't learning — retrieval is.