Exploring the charitable sector through a star schema

2011-02-22 (permalink tags: , , , )

We just released theSector.ca, a tool to explore the Canadian charitable sector. In a nutshell, it's an interactive directory of charities that one can use to gain insight on individual charities as well as trends such as fundraising, staffing, and expenditures across the charitable sector.

We've already covered why we believe that it's going to shake the charitable sector and now I want to take a moment to celebrate the cool tech that allowed us to make it happen.

Both Fundtracker and theSector are TurboGears 2 applications running on top of a Postgres database. In Fundtracker, we achieved very good performance through fine tuning of Postgres and sprinkling bits of denormalized data here and there. Since you need a subscription to use Fundtracker, we can keep an eye on the growth and we feel safe from overnight scalability problems but theSector being a freely accessible tool, we had to make sure that it could handle a Twitter storm and our main tool to do that is systematic denormalization into a star schema.

When used carefully, denormalization is a powerful concept. Many NO-SQL solutions require a healthy dose of denormalization and I suspect that the performance improvement that one get by switching to such a solution has a lot more to do with the data modeling constraints that they impose than with the underlying technology. However, when you take denormalization as the very goal of the exercise, you get to concentrate on modeling problems, which are likely to breed a totally new understanding of your data. Dimensional modeling enables lightning fast query performance but the genius of it is really that it makes drilling down natural and straightforward. With a good OLAP cube, not only can you see trends in real-time, you can also explain them with very little effort.


summary page of the environmental sector

TheSector features sector-wide summaries of staffing and fundraising practices. Since the source is public data, errors can occur in several places: bad reporting, data acquisition errors, logic errors during normalization, and incorrect assumptions on the meaning of the data which leads to aggregating data that does not belong together. Thank to the OLAP cube, we are able to aggregate the sector summaries on-the-fly -- all the charts are produced in real-time by looking at thousands of rows, each representing an individual charity. If we notice any discrepancy, producing a list of the organizations and of the numbers in involved is as easy as:

select * from fact_sect_financials
natural join dim_sector 
where ds_name = 'environment'

We can then quickly find out what part of the system introduced the error. SQLAlchemy and sqlalchemy-migrate definitely helped us to implement the cube. SQLAlchemy features an ORM but you don't have to use it and indeed, as much as it's natural to see an OLTP schema as a graph of objects, the OLAP cube is better seen as a collection of records. Also, the schema of the cube moves fast; as you see trends in the data, you want to drill down on new facets and therefore you add new dimension tables. Thanks to sqlalchemy-migrate, we can do that in a very agile manner without ever taking the system down. Of course, that means that we some question can't be answered until a new cube is computed but I guess this is a small price to pay.

Comments

2012-02-09 16:01:35 by homework tutor (direct link | reply)

I like the idea of the graphic indeed! It is very much helpful

2012-02-13 08:34:45 by foods that cause belly fat (direct link | reply)

it is possible to find organizations that have benefited from this assistance, but it is also possible to do a global search for NPOs. This search will go to any organization, what were its sources of funding, and therefore have a better idea of ​​the type of foundations, corporations or government programs that would make sense to go ask for another project, but similar.

2012-02-16 05:43:38 by buy dissertation (direct link | reply)

nice sharing.....

2012-02-16 05:43:59 by buy Admission essay (direct link | reply)

nicely done...

2012-02-16 05:44:22 by term paper (direct link | reply)

what a wonderful post....

2012-02-16 05:44:42 by research paper (direct link | reply)

please provide more posts.....

Leave a comment