Highly selective parent-child SQL query

When I write reports, I often times run into parent-child relationships where I only need one particular child row. And sometimes a group by simply does not cut it because max, average, sum, or some of those aggregate functions simply are not a good fit for your situation when you want some info (or the entire row) of the child that was modified last, or that has the largest amount in one column, etc.

So I use a combination the top command with an order by as seen in the query below. On the query the parents are on the SyStudent table and the children come from the AdEnroll table. There can be many AdEnrolls for a SyStudent.

For each SyStudent (parent) I’m selecting all the info of the AdEnroll (child) that was added most recently.

select stu.firstname, stu.lastname, enr.startdate, enr.gpa
from systudent stu 
inner join adenroll enr on enr.adenrollid = (
  -- get the stu's enrollment that was created most recently
  select top(1) enr2.AdEnrollId
  from AdEnroll enr2
  where enr2.systudentid = stu.systudentid
  order by enr2.dateadded desc)
Advertisements

7 thoughts on “Highly selective parent-child SQL query

  1. The only thing I can think of is put it in a temporary table first filtered set of data. Then put the next filtered set of data in another temp table, and then perform the work from that. I think a local temp table is with # or @ forget which. If this data can be polled only once a day you can most likely put it into a Global temp table ## or @@ depending on the DB environment. And it will be good until the system reboots. But it might be better to place your code into a Stored proc or function.

    If the data is really huge.. you might also want to add another column to the table to rank the information, and further filter it so it will respond faster? Maybe? 😉

    Marc

  2. Hi Nooner,

    Yeah that would work, a similar though crossed though my mind yesterday.

    Performance is still acceptable but if it gets to to be an issue I would create some warehouse (WH) tables in the same database where I have all the other tables.

    These WH tables would have crunched data and the data would be recalculated every 10 mins or so via a SQL job. My report would then access the WH tables just to get the data and display it with little or no processing.

    • I’ll have to look into WH tables for SQL 2008. I have been using SQL Agent for everything under the sun too, I guess it’s good for scheduled jobs. So do you use the WH on a replication server(copy of the production DB) or do you use it on the production?

      • Well, so far I don’t have a WH for this database. If I end up needing to speed up the report that uses this query, I would really just create regular tables to store the crunched data in the same production database and have a job that populates them every 10 mins or so. Make my own micro warehouse.

        I have read about data warehouses, but to be honest I have never used the SQL Server warehouse functionality. I’ve only done what I described in the paragraph above on a couple of applications I’ve created.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s