CEDAR AI Reference
A compact reference for writing CEDAR cones with AI assistance
This page is designed to be pasted directly into an AI assistant prompt (Claude, ChatGPT, etc.) along with your question. It gives the AI the data schema, cone pattern, and conventions it needs to write correct, working CEDAR code.
How to use it:
- Copy everything from the horizontal rule below to the end of this page
- Paste it into your AI chat
- Add your question: “Write a CEDAR cone that [describes what you want to know]”
- Review the output — check that column names and join keys match exactly
CEDAR Data Model
CEDAR organizes institutional data into five standard tables. All column names use snake_case.
cedar_sections — one row per course section per term
| Column | Type | Description | Example |
|---|---|---|---|
term | integer | 6-digit term code (YYYYCC) | 202580 (Fall 2025) |
crn | string | Course Reference Number | “12345” |
subject | string | Subject code | “MATH” |
course_number | string | Course number | “1350” |
subject_course | string | Combined (used for filtering) | “MATH 1350” |
section | string | Section number | “001” |
course_title | string | Course title | “Calculus I” |
campus | string | Campus | “Main”, “Online” |
college | string | College code | “AS” |
department | string | Department code | “MATH” |
instructor_id | string | Encrypted instructor ID | join key to cedar_faculty |
instructor_name | string | Instructor name | “Smith, Jane” |
enrolled | integer | Current enrollment count | 28 |
capacity | integer | Section capacity | 30 |
status | string | Section status | “A” (active), “C” (cancelled) |
delivery_method | string | Mode of delivery | “F2F”, “Online”, “Hybrid” |
level | string | Course level | “lower”, “upper”, “grad” |
term_type | string | Term type | “fall”, “spring”, “summer” |
part_term | string | Part of term | “FT” (full), “1H”, “2H” |
crosslist_group | string | Crosslist group ID (NA if none) | “XL-12345” |
crosslist_role | string | Role in crosslist | “home”, “partner”, NA |
is_split | logical | Split-level crosslist flag | TRUE/FALSE |
Key note: Term codes end in 10 (spring), 60 (summer), 80 (fall). Course level: below 300 = “lower”, 300–499 = “upper”, 500+ = “grad”.
cedar_students — one row per student per section (class lists)
| Column | Type | Description | Example |
|---|---|---|---|
enrollment_id | string | Unique row ID | — |
section_id | string | FK to cedar_sections | “202580-12345” |
student_id | string | Encrypted student ID | join key across tables |
term | integer | Term code | 202580 |
subject_course | string | Course (denormalized) | “MATH 1350” |
campus | string | Course campus | “Main” |
college | string | Course college | “AS” |
department | string | Course department | “MATH” |
registration_status_code | string | Status | “RE” (registered), “DR” (dropped), “W” (withdrawn) |
final_grade | string | Final grade | “A”, “B+”, “C”, “W”, “F”, “I” |
student_level | string | Student level | “UG”, “GR” |
student_classification | string | Class standing | “FR”, “SO”, “JR”, “SR” |
major | string | Student’s major code | “MATH-BS” |
student_college | string | Student’s college | “AS” |
term_type | string | Term type | “fall”, “spring”, “summer” |
Key note: student_id is encrypted — never plaintext. Use it as a join key across tables to track the same student across terms. Passing grades: A, B, C. DFW grades: D, F, W, I (incomplete).
cedar_programs — one row per student-program per term
| Column | Type | Description | Example |
|---|---|---|---|
student_id | string | Encrypted student ID | join key |
term | integer | Term code | 202580 |
program_type | string | Type | “Major”, “Minor”, “Concentration” |
program_name | string | Full program name | “Mathematics BS” |
college | string | Program’s college | “AS” |
department | string | Program’s department | “MATH” |
student_level | string | UG or GR | “UG” |
cedar_degrees — one row per degree awarded
| Column | Type | Description | Example |
|---|---|---|---|
student_id | string | Encrypted student ID | join key |
term | integer | Graduation term | 202510 |
degree | string | Degree type | “BS”, “BA”, “MS”, “PhD” |
program_name | string | Program name | “Mathematics BS” |
college | string | College | “AS” |
department | string | Department | “MATH” |
graduation_status | string | Status | “Conferred”, “Pending” |
cedar_faculty — one row per instructor per term
| Column | Type | Description | Example |
|---|---|---|---|
instructor_id | string | Encrypted ID (matches cedar_sections) | join key |
instructor_name | string | Full name | “Smith, Jane D.” |
term | integer | Term code | 202580 |
department | string | Home department | “MATH” |
job_category | string | Employment type | “Professor”, “Associate Professor”, “Assistant Professor”, “Lecturer”, “Term Teacher”, “TPT”, “Grad” |
appointment_pct | numeric | FTE as decimal 0.0–1.0 | 1.0 = full-time, 0.5 = half-time |
Key note: Only “Professor”, “Associate Professor”, “Assistant Professor”, and “Lecturer” count as permanent faculty for SFR calculations. appointment_pct is always 0.0–1.0, not a percentage.
Table relationships
cedar_sections
├── cedar_students (join on section_id)
│ ├── cedar_programs (join on student_id)
│ └── cedar_degrees (join on student_id)
└── cedar_faculty (join on instructor_id + term)
Cone pattern
A cone is an R function that accepts CEDAR data, filters it via an options list, analyzes it, and returns a standardized result.
Function signature
my_cone <- function(students, courses, opt, additional_data = NULL) {
# students = cedar_students
# courses = cedar_sections
# opt = named list of filters and options
# additional_data = optional: list with degrees, programs, faculty
}
Standard options (opt)
Common fields in the options list:
opt <- list(
term = 202580, # integer term code; may be a vector for multi-term
dept = "MATH", # department code filter (optional)
college = "AS", # college code filter (optional)
course = "MATH 1350", # subject_course filter (optional)
level = "upper", # course level filter (optional)
status = "A", # section status filter; default "A" for active only
uel = TRUE # use exclude list (ignore certain sections)
)
Full cone template
# [cone-name].R
# Question: [One sentence describing what question this cone answers]
#' [Function title]
#'
#' [Brief description of the approach and what the result contains]
#'
#' @param students cedar_students data frame
#' @param courses cedar_sections data frame
#' @param opt Named list of options (term required; dept, college, course optional)
#' @return Named list with: data (data frame), metadata (list)
my_cone <- function(students, courses, opt) {
# 1. Validate required options
validate_cone_options(opt, required = c("term"))
# 2. Set defaults
opt <- set_cone_defaults(opt, list(
status = "A"
# add cone-specific defaults here
))
# 3. Filter sections
sections <- courses %>%
filter(
term %in% opt$term,
status == opt$status
)
if (!is.null(opt$dept)) sections <- sections %>% filter(department == opt$dept)
if (!is.null(opt$college)) sections <- sections %>% filter(college == opt$college)
if (!is.null(opt$course)) sections <- sections %>% filter(subject_course == opt$course)
# 4. Filter students to matching sections
enrolled <- students %>%
filter(section_id %in% sections$section_id)
# 5. [Your analysis here]
result <- enrolled %>%
group_by(subject_course, term) %>%
summarize(
n_students = n_distinct(student_id),
# ...
.groups = "drop"
)
# 6. Return standardized result
return(list(
data = result,
metadata = list(
function_name = "my_cone",
options_used = opt,
row_count = nrow(result)
)
))
}
Common patterns
Track students across terms (course sequences)
# Students who took course A, then look up their outcome in course B
course_a_students <- cedar_students %>%
filter(subject_course == "CHEM 1215", term == 202480) %>%
select(student_id, grade_in_a = final_grade)
course_b_students <- cedar_students %>%
filter(subject_course == "CHEM 1225", term == 202510) %>%
select(student_id, grade_in_b = final_grade)
sequence <- course_a_students %>%
inner_join(course_b_students, by = "student_id")
DFW rate by group
cedar_students %>%
filter(term %in% opt$term, !is.na(final_grade)) %>%
group_by(subject_course) %>%
summarize(
total = n(),
dfw = sum(final_grade %in% c("D", "F", "W", "I")),
dfw_rate = dfw / total
)
Join sections to faculty
sections %>%
left_join(
cedar_faculty %>% select(instructor_id, term, job_category, appointment_pct),
by = c("instructor_id", "term")
)
Enrollment by level over time
cedar_sections %>%
filter(status == "A", department == opt$dept) %>%
group_by(term, term_type, level) %>%
summarize(
sections = n(),
enrolled = sum(enrolled),
.groups = "drop"
)
Prompting tips
- Be specific about the question: “What predicts success in CHEM 1225 given prior CHEM 1215 grade?” is better than “analyze course outcomes.”
- Name the tables you need: Tell the AI which of the five tables are relevant. Most questions use
cedar_studentsandcedar_sections; cross-term questions need joins onstudent_id. - Ask for the opt pattern: Request that the cone accept
opt$dept,opt$term, etc. so it’s generalizable. - Verify column names: AI models may invent plausible-sounding column names. Check every column reference against the schema above.
- Check join keys: The most common error is joining on the wrong column. Confirm: sections join students on
section_id; student-level joins across tables usestudent_id.