The Mid-Hudson Valley Community Profiles project is divided into eight main topics: Children and Youth, Community Engagement, Demographics, Economy, Education, Financial Stability, Health, and Housing.
Within each topic are specific indicators, each with data displayed in tables and charts, analysis, and information about data definitions, limitations of the data, and sources. Each main topic also has an overview section summarizing key trends and highlights of the data.
The Center for Governmental Research (CGR) collected and analyzed the best available data from national, state, county and local agencies. Whenever possible, CGR used New York State sources for data rather than data from local sources to ensure consistent definitions and reporting and to enable consistent and reliable comparisons across counties. The data sources are listed on the data tables and charts for each indicator.
To make the data meaningful, CGR performed several conversions. These include converting raw numbers to rates, often by creating per-capita figures using population data from the U.S. Census Bureau (discussed in more detail below). CGR also adjusted dollar figures for inflation to provide a reasonable basis for comparisons over time.
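The two conversions described above can be illustrated with a minimal sketch. The function names, the per-1,000 base, and the use of a generic price index are assumptions made for illustration; the source does not state which rate base or which inflation index CGR used.

```python
def per_capita_rate(count, population, per=1_000):
    """Convert a raw count to a rate per `per` residents.

    `per=1_000` is an illustrative default, not a documented CGR choice.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    return count * per / population


def adjust_for_inflation(amount, index_then, index_now):
    """Restate a dollar amount from an earlier year in later-year dollars.

    `index_then` and `index_now` are values of a price index (for example,
    a consumer price index) for the two years; which index to use is an
    assumption left to the caller.
    """
    if index_then <= 0:
        raise ValueError("index_then must be positive")
    return amount * index_now / index_then


# Hypothetical figures, for illustration only.
rate = per_capita_rate(250, 100_000)                   # 250 events among 100,000 residents
adjusted = adjust_for_inflation(40_000, 100.0, 125.0)  # prices rose 25% between the two years
print(rate, adjusted)
```

Expressing counts as rates lets counties of different sizes be compared directly, and restating dollars in a common year keeps price changes from being mistaken for real changes.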
Data were not always available for every geographic area for every indicator. In some cases, national data were not available or were not available in comparable form; for example, differences in how and when federal and state agencies collect data can mean that figures are not comparable. In those cases, CGR presents data for the project counties and for New York State, but not for the nation.
In addition, data are not always available or presented for small groups. For example, data are not presented for some very small racial or ethnic groups, such as American Indians, Alaska Natives, and Native Hawaiians. These populations are so small that data about them are sometimes not valid or not useful in identifying trends.
To suggest that a trend exists, there must be a clear pattern of consistent movement in the same direction over several years. Caution should be exercised in drawing conclusions from fluctuations in data from one year to the next. Such one-year fluctuations, even if substantial, typically are not sufficiently reliable for planning and assessment purposes.
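As a rough illustration of this rule of thumb, the sketch below reports a direction only when every year-to-year change in a series points the same way. The function name and the threshold of three consecutive changes are assumptions; the source does not define how many years count as "several."

```python
def consistent_direction(values, min_changes=3):
    """Return "up" or "down" if every year-to-year change moves the same way.

    `values` is a list of annual figures in chronological order.
    `min_changes` is a hypothetical threshold for "several years".
    Returns None when the series is too short or the direction is mixed.
    """
    changes = [later - earlier for earlier, later in zip(values, values[1:])]
    if len(changes) < min_changes:
        return None
    if all(change > 0 for change in changes):
        return "up"
    if all(change < 0 for change in changes):
        return "down"
    return None


# Hypothetical annual figures, for illustration only.
print(consistent_direction([10, 11, 13, 14]))  # steady rise
print(consistent_direction([10, 14, 9, 12]))   # year-to-year fluctuation
```

A check like this is deliberately conservative: a single reversal, however small, is enough to withhold the label of a trend.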
CGR presents data up to the most recent year for which reliable data were available from the source when the project was conducted. In some cases, data may be available for a general indicator but not for a specific breakdown of the same data. This is true, for example, of state test score data. While passing rates for 2009 are available, the latest year for passing rates by students’ economic status is 2008 and the latest year for passing rates by students’ racial/ethnic background is 2007.
CGR has been very careful in collecting, analyzing, and presenting data from a variety of sources. Although CGR has judged its data sources to be reliable, it was not possible to authenticate all data. If careful users of the website discover data errors or typographical errors, CGR welcomes this feedback and will incorporate corrections into the data updates. Please contact CGR directly with such information at (585) 325-6360 or by email.
This project uses three types of Census data: information from the 2000 decennial census, data from the Population Estimates Program, and information from the American Community Survey. Data from and information about all three sources can be accessed through the U.S. Census Bureau's American FactFinder website.
The decennial census occurs every 10 years to collect information about the people and housing of the United States. This project uses data released from the 2000 Census; data from the 2010 Census are expected to be available through a series of releases from 2011 through 2013. The decennial census is the official count of everyone in the United States, and it is used to divide the 435 seats in the House of Representatives among the states. Census data are used in innumerable other ways, including in distributing federal funding among states and localities and in many types of research.
The Census Bureau's Population Estimates Program publishes population numbers between censuses to estimate how much populations are growing or shrinking. The program develops and prepares estimates of the population by age, sex, race, and Hispanic origin for the nation, states, and counties. The estimates use information from government records of births, deaths, domestic and international migration, and other sources to estimate changes in populations.
The Census Bureau's American Community Survey (ACS) is a nationwide survey designed to provide communities a fresh look at how they are changing. The ACS collects information such as age, race, income, commute time to work, home value, veteran status, and other important data. This project uses the latest data covering the five years spanning 2005 through 2009. These data are used to describe and track characteristics of the population (including income, poverty, educational attainment, and more) that used to be collected only during the decennial census. The Bureau combined five years of responses to the survey to provide estimates for smaller geographic areas (those with 20,000 or more residents) and to increase the precision of its estimates.
However, because the information came from a survey, the samples responding to the survey were not always large enough to produce reliable results, especially in small geographic areas. CGR has noted on data tables the estimates with relatively large margins of error. Estimates with three asterisks have the largest margins, plus or minus 50% or more of the estimate. Two asterisks mean plus or minus 35%–50%, and one asterisk means plus or minus 20%–35%. For all estimates, the confidence level is 90%, meaning there is a 90% probability that the true value (the value that would be obtained if the whole population were surveyed) falls within the margin of error of the estimate, that is, within the confidence interval.
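The asterisk scheme above can be sketched as a small function that compares the margin of error with the estimate. The function name is hypothetical, and because the stated ranges share their endpoints (35% and 50%), the sketch assumes that a value exactly on a boundary receives the higher flag; the source does not say how CGR handled those cases or zero estimates.

```python
def margin_flag(estimate, margin_of_error):
    """Return the asterisk flag for an estimate's relative margin of error.

    The relative margin is the margin of error divided by the estimate.
    Assumption: a value exactly on a boundary (20%, 35%, 50%) gets the
    higher flag. A zero estimate has no defined relative margin, so this
    sketch raises an error rather than guess.
    """
    if estimate == 0:
        raise ValueError("relative margin is undefined for a zero estimate")
    relative = abs(margin_of_error) / abs(estimate)
    if relative >= 0.50:
        return "***"
    if relative >= 0.35:
        return "**"
    if relative >= 0.20:
        return "*"
    return ""


# Hypothetical estimates and margins, for illustration only.
for estimate, moe in [(1000, 600), (1000, 400), (1000, 250), (1000, 100)]:
    print(estimate, moe, margin_flag(estimate, moe))
```

Scaling the margin by the estimate is what makes the flags comparable across indicators: a margin of 100 is negligible for an estimate of 10,000 but large for an estimate of 300.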
Data presented for New York State in this project typically exclude New York City, and this is noted where it applies. This is because the state outside of New York City was believed to be more comparable to the project counties than the entire state.