Data Platform

Platform lead: Dr. Finlay McAlister

We provide a collaborative opportunity to generate patient-oriented health-systems research involving secondary analysis of existing large data sets. Our focus is on Alberta’s administrative databases. We have statistical, methodological and database expertise, and can work with you to design a research plan, manage data, develop and execute a biostatistical plan to answer the research question, and generate training opportunities.

We work closely with the Methods Support & Development Platform and the Consultation & Research Services Platform to provide a variety of opportunities to generate and complete informative studies.

Services we offer

Dataset feasibility

We can work with you to determine what data are available to help answer the research questions at hand. We can help work out which data sets are required and assist with access to the data through the repository owners. We can provide information on the time periods and update cycles for databases and variables, which is useful when planning the timelines of your research study. You might have a clear idea of what you would like to use a secondary dataset for, but require assistance with the appropriate handling of the variables or the statistical methodology relevant to the disease state and question. We can work with you in preparing to undertake your research or in preparing a grant application.

Biostatistical plan development

The analysis of secondary-use data calls for particular strategies and methods. We are happy to collaborate with you to determine the statistical plan that best tests your hypothesis and maximizes the impact of the analysis. We can work with you in the planning stage so that this can be part of a grant application.

Case volume estimates

In running studies with primary data collection, it is essential to know how many patients to recruit. With secondary use data, the question is often flipped. We can help determine how many patients there are with a specific procedure or diagnosis in either a specific region or the entire province.

Sample size calculations

Sample size calculations are an important consideration in undertaking a study; however, it is important to consider more than null rates and clinically important differences. We can help generate a working sample size that considers the disease, patients, and standard treatment. These types of calculations can be important both in primary data collection studies and in those that rely on secondary data, particularly in cases where there may not be a large case volume (rare conditions/procedures) or where there are other treatment considerations.
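As a rough illustration of the usual starting point, the sketch below computes an approximate per-group sample size for comparing two proportions, using the standard normal-approximation formula. The function name, the default significance level and power, and the example rates are assumptions made for illustration only; a working sample size would also need to account for the disease, patient and treatment considerations described above.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_two_proportions(p_null, p_alt, alpha=0.05, power=0.80):
    """Approximate patients needed per group for a two-sided comparison.

    p_null: event rate expected under standard treatment (the "null rate").
    p_alt:  event rate reflecting the clinically important difference.
    alpha and power are illustrative defaults, not recommendations.
    """
    if not (0 < p_null < 1 and 0 < p_alt < 1) or p_null == p_alt:
        raise ValueError("rates must lie strictly between 0 and 1 and differ")

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_null + p_alt) / 2  # pooled rate under the null hypothesis

    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_null * (1 - p_null) + p_alt * (1 - p_alt))
    ) ** 2
    return ceil(numerator / (p_null - p_alt) ** 2)


# Hypothetical example: 20% event rate on standard treatment, 10% hoped for.
n_per_group = sample_size_two_proportions(0.20, 0.10)
print(n_per_group)
```

Comparing this figure with a case volume estimate gives an early sense of whether a study of a rare condition or procedure is feasible.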

Consultation on what types of data are available can be very useful. Secondary data typically has some limitations in terms of what is coded and how. It is important to figure this out early so that the research addresses the right question.

With our access to available administrative data sources, we can work with you to determine the best way to answer the research question and run some pilot data pulls to look at how the data are coded, case volumes, etc.

Pilot data extraction

During the feasibility stage of your research project, we can complete pilot test data pulls to find out more about the data that will be received and to create a plan and process for aggregating the data into one data set.

Preparation of data access agreements

In order to access data from the province and other sources, data agreements that detail the responsibilities of the research team must be signed with the repository owners. This can be a complex process, and the research team will need to provide detailed information on the purpose and use of the requested data. The Data Platform can assist in this process.

Data extraction

With our access to reliable, high-quality administrative data, we can help your project by bringing in data from a number of different information systems across the province, linked as necessary. In order to access this service, ethics approvals are required, and we can assist with data access agreements.

Data linking and merging

Generating a meaningful answer to a research question may require data from several sources. Linking the datasets together can be a time-consuming task, particularly if the data are delivered in different formats. We can link the data together to provide one seamless database for a comprehensive analysis.
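To give a sense of why differing formats make linking time consuming, here is a minimal sketch that links two small extracts on a patient identifier. Everything in it is hypothetical: the field names, the identifier and date formats, and the records themselves are invented for illustration and do not describe any actual Alberta database.

```python
from datetime import datetime

# Hypothetical extracts delivered in different formats: one pads the
# identifier with zeros and uses ISO dates, the other stores the identifier
# as a number and uses day/month/year dates.
hospital_rows = [
    {"patient_id": "00123", "admit_date": "2019-03-04", "diagnosis": "I50"},
    {"patient_id": "00456", "admit_date": "2019-05-20", "diagnosis": "E11"},
]
pharmacy_rows = [
    {"PID": 123, "dispensed": "04/03/2019", "drug": "furosemide"},
    {"PID": 789, "dispensed": "11/06/2019", "drug": "metformin"},
]


def normalize_id(raw):
    """Bring identifiers to one form: text without leading zeros."""
    return str(raw).strip().lstrip("0") or "0"


def parse_date(text, fmt):
    """Convert a date string in the given format to a date object."""
    return datetime.strptime(text, fmt).date()


def link_extracts(hospital, pharmacy):
    """Attach each patient's dispensing records to their hospital record."""
    dispensings = {}
    for row in pharmacy:
        key = normalize_id(row["PID"])
        dispensings.setdefault(key, []).append(
            {"dispensed": parse_date(row["dispensed"], "%d/%m/%Y"),
             "drug": row["drug"]}
        )

    linked = []
    for row in hospital:
        key = normalize_id(row["patient_id"])
        linked.append({
            "patient_id": key,
            "admit_date": parse_date(row["admit_date"], "%Y-%m-%d"),
            "diagnosis": row["diagnosis"],
            # Patients with no pharmacy records keep an empty list.
            "dispensings": dispensings.get(key, []),
        })
    return linked


linked = link_extracts(hospital_rows, pharmacy_rows)
```

The sketch keeps every hospital record whether or not a match is found, so unmatched patients remain visible in the linked data rather than silently dropping out.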

Generation of control groups

We can pull administrative data on the variables of interest for you to use as control group data within the research study. Access to the larger administrative data sets allows us to go further and consider age/sex or similar matches.
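One simple way to picture an age/sex match is sketched below: each case is paired with a fixed number of controls of the same sex whose age falls within a set window, and no control is used twice. The function name, the field names, the two-year window and the two-controls-per-case ratio are all assumptions chosen for illustration; an actual matching strategy would be agreed as part of the biostatistical plan.

```python
import random


def match_controls(cases, pool, per_case=2, age_window=2, seed=0):
    """Match each case to up to `per_case` unused controls on sex and age.

    cases and pool are lists of dicts with hypothetical keys
    "id", "sex" and "age". Returns {case id: [matched control dicts]}.
    """
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible
    available = list(pool)
    rng.shuffle(available)     # avoid always picking controls in file order

    matches = {}
    for case in cases:
        chosen, remaining = [], []
        for candidate in available:
            eligible = (
                candidate["sex"] == case["sex"]
                and abs(candidate["age"] - case["age"]) <= age_window
            )
            if eligible and len(chosen) < per_case:
                chosen.append(candidate)
            else:
                remaining.append(candidate)
        available = remaining  # matched controls leave the pool
        matches[case["id"]] = chosen
    return matches


# Hypothetical records for illustration only.
cases = [{"id": "A", "sex": "F", "age": 64}, {"id": "B", "sex": "M", "age": 50}]
pool = [
    {"id": "c1", "sex": "F", "age": 63},
    {"id": "c2", "sex": "F", "age": 65},
    {"id": "c3", "sex": "F", "age": 70},
    {"id": "c4", "sex": "M", "age": 64},
]
matched = match_controls(cases, pool)
```

A case for which the pool holds no eligible control ends up with an empty list, which is worth checking before the analysis begins.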

Data cleaning

No dataset is perfect, and there will be incomplete and inaccurate records within the dataset. Inaccurate records can be identified when values have been entered that fall outside the acceptable range. Missing data may be simply blank, or coded as a nine or some other number. All of these factors can cause issues when analyzing the dataset, so the data must be cleaned. We can help with this.
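The two checks described above can be sketched in a few lines. The function below treats blanks and a configurable set of numeric codes as missing, and flags values outside an acceptable range. The function name, the missing-value codes and the example range are hypothetical; in practice they would come from the data dictionary for each variable, since a nine can be a perfectly valid value for some fields.

```python
from collections import Counter


def clean_value(raw, low, high, missing_codes=("", "9", "99", "999")):
    """Return (value, issue) for one raw field.

    value is a float, or None when the field cannot be used.
    issue is None, "missing", "not_numeric" or "out_of_range".
    """
    text = "" if raw is None else str(raw).strip()
    if text in missing_codes:
        return None, "missing"
    try:
        value = float(text)
    except ValueError:
        return None, "not_numeric"
    if not (low <= value <= high):
        return None, "out_of_range"
    return value, None


# Hypothetical systolic blood pressure readings with an assumed
# acceptable range of 50 to 300 and "999" as the only missing code.
raw_readings = ["120", "", "999", "20", "abc", " 135 "]
results = [
    clean_value(r, low=50, high=300, missing_codes=("", "999"))
    for r in raw_readings
]

usable = [value for value, issue in results if issue is None]
issue_counts = Counter(issue for _, issue in results if issue is not None)
```

Counting the issues per variable, rather than quietly dropping records, shows how much of the dataset each problem affects before any analysis decisions are made.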

Data analyses

We can collaborate with your team to provide complete statistical analyses of the data if your group doesn’t have in-house experience with administrative data. We can also work with you on the dissemination of the results.

Outcomes modelling

Health care outcomes are often the result of many factors. We can help develop appropriate modelling strategies to explore the research question, accounting for some of the important factors that may influence the outcome.

Data visualization

We can help generate tabular and graphical representations to clearly communicate information about the data and analysis.

More information about the Alberta SPOR SUPPORT Unit:

The Alberta SPOR SUPPORT Unit is jointly funded by Alberta Innovates and the Canadian Institutes of Health Research.