XRay, first step in understanding how personal data is being used on web services

New tool makes online personal data more transparent

8/18/14 Columbia Engineering researchers develop XRay, first step in understanding how personal data is being used on web services like Google, Amazon, and YouTube

New York, NY - August 18, 2014 - The web can be an opaque black box: it leverages our personal information without our knowledge or control. When, for instance, a user sees an ad about depression online, she may not realize that she is seeing it because she recently sent an email about being sad. Roxana Geambasu and Augustin Chaintreau, both assistant professors of computer science at Columbia Engineering, are seeking to change that and, in doing so, bring more transparency to the web. Along with their PhD student, Mathias Lecuyer, the researchers have developed XRay, a new tool that reveals which data in a web account, such as emails, searches, or viewed products, are being used to target which outputs, such as ads, recommended products, or prices. They will be presenting the prototype, which is designed to make the online use of personal data more transparent, at USENIX Security on August 20. The researchers have posted the open-source system and their findings online, so that other researchers interested in studying how web services use personal data can leverage and extend them.
“Today we have a problem: the web is not transparent. We see XRay as an important first step in exposing how websites are using your personal data,” says Geambasu, who is also a member of Columbia’s Institute for Data Sciences and Engineering’s Cybersecurity Center.
We live in a “big data” world, where staggering amounts of personal data—our locations, search histories, emails, posts, photos, and more—are constantly being collected and analyzed by Google, Amazon, Facebook, and many other web services. While harnessing big data can certainly improve our daily lives (Amazon offerings, Netflix suggestions, emergency response Tweets, etc.), these beneficial uses have also generated a big data frenzy, with web services aggressively pursuing new ways to acquire and commercialize the information.
“It’s critical, now more than ever, to reconcile our privacy needs with the exponential progress in leveraging this big data,” says Chaintreau, a member of the Institute for Data Sciences and Engineering’s New Media Center. Geambasu adds, “If we leave it unchecked, big data’s exciting potential could become a breeding ground for data abuses, privacy vulnerabilities, and unfair or deceptive business practices.”
Built to provide checks and balances on data abuse, XRay is designed to be the first fine-grained, scalable personal data tracking system for the web. One can use the XRay prototype, for instance, to study why a user might be shown a specific ad in Gmail. Geambasu and Chaintreau found, for example, that a Gmail user who sees ads about various forms of spiritualism might have received them because he or she sent an email message about depression.
Developing XRay was challenging, say the researchers. “The science of understanding the use of personal web data at a fine grain—looking at individual emails, photos, posts, etc.—is largely non-existent,” Geambasu notes. “There really isn’t anything out there that can accurately pinpoint which specific input—which search query, visited site, or viewed product—or combination of inputs explains which output. It was clear that we needed to come up with a new, robust auditing tool, one that can be applied effectively to many different services.”
How it Works
“We knew from the start that our biggest challenge in achieving transparency would be scale—how do we continue to track more data while using minimum resources?” Chaintreau says. “The theoretical results were encouraging, but seemed too good to be true. So we tested XRay in actual situations, learning from experiments we ran on Gmail, Amazon, and YouTube, and refining the design multiple times. The final design surprised us: XRay succeeded in all the experiments we ran, and it matched our theoretical predictions in increasingly complex cases. That is when we finally thought that achieving web transparency at large is not a dream in a distant future but something we can start building toward now.”
The current XRay system works with Gmail, Amazon, and YouTube. However, XRay’s core functions are service-agnostic and easy to instantiate for new services, and they can track data within and across services. The key idea in XRay is to use black-box correlation of data inputs and outputs to detect data use.
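The black-box correlation idea can be illustrated with a small sketch. This is not XRay's actual algorithm, only a simplified illustration under stated assumptions: a set of test accounts is each populated with a different subset of inputs (for example, emails), the outputs (for example, ads) shown to each account are recorded, and an output is attributed to an input when it appears in most accounts that contain the input and in few that do not. The function name, the thresholds, and the sample data are all hypothetical.

```python
from typing import Dict, Optional, Set


def attribute_output(
    account_inputs: Dict[str, Set[str]],
    account_outputs: Dict[str, Set[str]],
    output: str,
    min_with: float = 0.8,     # hypothetical threshold, chosen for illustration
    max_without: float = 0.2,  # hypothetical threshold, chosen for illustration
) -> Optional[str]:
    """Return the input whose presence best correlates with `output`, or None."""
    all_inputs: Set[str] = set().union(*account_inputs.values())
    best_input, best_gap = None, 0.0
    for candidate in sorted(all_inputs):
        with_c = [a for a, ins in account_inputs.items() if candidate in ins]
        without_c = [a for a, ins in account_inputs.items() if candidate not in ins]
        if not with_c or not without_c:
            continue  # no comparison group, so nothing can be concluded
        # Fraction of accounts that saw the output, with and without the input.
        rate_with = sum(output in account_outputs.get(a, set()) for a in with_c) / len(with_c)
        rate_without = sum(output in account_outputs.get(a, set()) for a in without_c) / len(without_c)
        gap = rate_with - rate_without
        if rate_with >= min_with and rate_without <= max_without and gap > best_gap:
            best_input, best_gap = candidate, gap
    return best_input


# Made-up observations: which emails each test account holds, and which ads it saw.
inputs = {
    "acct1": {"email_depression", "email_travel"},
    "acct2": {"email_depression", "email_debt"},
    "acct3": {"email_travel", "email_debt"},
    "acct4": {"email_travel"},
}
outputs = {
    "acct1": {"ad_spiritualism", "ad_flights", "ad_generic"},
    "acct2": {"ad_spiritualism", "ad_car_loan", "ad_generic"},
    "acct3": {"ad_flights", "ad_car_loan", "ad_generic"},
    "acct4": {"ad_flights", "ad_generic"},
}

print(attribute_output(inputs, outputs, "ad_spiritualism"))  # email_depression
print(attribute_output(inputs, outputs, "ad_generic"))       # None (shown everywhere)
```

In this toy data, the spiritualism ad appears only in the accounts that hold the depression email, so it is attributed to that email. The generic ad appears in every account and is attributed to nothing, which is how a correlation-based approach separates targeted outputs from untargeted ones.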
To assess XRay’s practical value, the researchers created an XRay-based demo service that continuously collects and diagnoses Gmail ads related to a set of topics, including various diseases, pregnancy, race, sexual orientation, divorce, debt, etc. They created emails that included keywords closely related to one topic and then launched XRay’s Gmail ad collection and examined the targeting associations. XRay’s data is now available online to anyone interested in sensitive-topic ad targeting in Gmail.
“We’ve just started to peek into XRay’s targeting data and even at this early stage, we’ve seen a lot of interesting behaviors,” Geambasu says. “We know that we need larger-scale experience to formalize and quantify our conclusions, but we can already make several interesting observations.”
The researchers note three things: (1) it is definitely possible to target sensitive topics in users’ inboxes, including cancer, depression, or pregnancy; (2) for many ads, targeting was extremely obscure and non-obvious to end users, which opens them up to abuses; and (3) there are already signs of such abuses, for instance a number of subprime loan ads for used cars targeting debt in users’ inboxes. Examples of ads and their targeted topics can be found on the XRay website.
The tool can be used to increase users’ awareness of how their data is being used, as well as to provide much-needed tools for auditors, such as researchers, journalists, and investigators, to keep that use under scrutiny. Geambasu and Chaintreau, who recently won a Magic Grant from the Brown Institute for Media Innovation to build better transparency tools, have made the XRay prototype available for auditors at http://xray.cs.columbia.edu.
“Our work calls for and promotes the best practice of voluntary transparency,” says Chaintreau, “while at the same time empowering investigators and watchdogs with a significant new tool for increased vigilance, something we need more of every day.”


Facebook emotion study examined by Canadian privacy commissioner


http://www.cbc.ca/news/business/facebook-emotion-study-examined-by-privacy-commissioner-1.2695145
“… European regulators are probing the matter, with the U.K.’s Information Commissioner’s Office working with counterparts in France and Ireland (where Facebook’s European operations are located) to get more details on the study.”

PROBLEMS WITH FACEBOOK
How to report a problem and get an underage kid off Facebook.

Dept. of Ed Funding Opportunities

FUNDING OPPORTUNITIES
 
The Department is currently seeking applications for the Elementary and Secondary School Counseling Program, the Advanced Placement (AP) Test Fee Program, and the Turnaround School Leaders Program.  The school counseling program provides funding to districts to establish or expand school counseling programs, with special consideration given to applicants that can: demonstrate the greatest need for counseling services in the schools to be served; propose the most innovative and promising approaches; and show the greatest potential for replication and dissemination.  Applications are due April 28.  The test fee program awards grants to states to enable them to pay all or a portion of AP test fees on behalf of low-income students.  Applications are due May 8.  The school leadership program supports projects to develop and implement or enhance and implement a critical leadership pipeline that selects, prepares, places, supports, and retains school leaders for School Improvement Grant (SIG) schools or SIG-eligible schools.  Applications are due May 23.
 
Moreover, for the current fiscal year, the Department’s Office of Innovation and Improvement (OII) is conducting 13 grant competitions across five program areas: Arts in Education, Charter Schools, Investing in Innovation (i3), Full-Service Community Schools, and Teacher Quality Partnerships.  Four of the competitions are already underway.  Announcements of the other competitions are slated for later this spring.
 
Also, be sure to review the Department’s FY 2014 Grants Forecast (as of March 31), which lists virtually all programs and competitions under which the agency has invited or expects to invite applications for awards and provides actual or estimated dates for the transmittal of applications under these programs. (Note: This document is advisory only and not an official application notice of the Department of Education.)

Skills for the New Economy: Preparing Students for College and Careers

http://www2.ed.gov/about/overview/budget/budget15/crosscuttingissues/skillsforneweconomy.pdf
 
RETHINKING HIGH SCHOOL
 
On April 7, during his visit to Bladensburg High School in Prince George’s County, Maryland, President Obama announced 24 Youth CareerConnect grants, providing $107 million to local partnerships of school districts, institutions of higher education, workforce investment boards, and employers as they redesign the teaching and learning experience for youth to more fully prepare them with the knowledge, skills, and industry-relevant education needed to get on the pathway to a successful career, including postsecondary education or registered apprenticeship.  “We challenged America’s high schools to…say what they can do to make sure their students learn the skills that businesses are looking for in high-demand fields,” the President said.  “And we asked high schools to develop partnerships with colleges and employers and create classes that focus on real life applications for the fields of the future — fields like science and technology and engineering and math….  The winners across the board are doing the kinds of things that will allow other schools to start duplicating what they’re doing…. And that’s what we want for all the young people here.  We want an education that engages you…that equips you with the rigorous and relevant skills for college and for a career” (blog post, with remarks and video).
 
The Youth CareerConnect program was established this year by the Labor Department, in collaboration with the Education Department, using one-time revenues from the H-1B visa program.  Grants range from $2.2 million to $7 million.  The program wholly complements additional proposals in the President’s Fiscal Year 2015 budget to ensure high school students graduate ready for college and career success and to help the U.S., once again, lead the world in college attainment.
 
Bladensburg High School was part of a three-school team from the county that won a $7 million grant.  It offers several career academies with high school curricula aligned with college-level entrance requirements for Maryland’s state university system.  Through a collaborative effort with local partners, it will expand the capacity of its Health and Biosciences Academy to better prepare more students for one of the region’s highest growth industries.  Students who concentrate in health professions will be able to earn industry-recognized certifications in the fields of nursing and pharmacy.  Biomedical students will be able to earn college credit from the University of Maryland at Baltimore County and the Rochester Institute of Technology.  All students will have access to individualized college and career counseling designed to improve preparation for college-level coursework and the attainment of industry-recognized credentials.  Students will also have the ability to receive postsecondary credit while still in high school and have access to paid work experiences with employer partners such as Lockheed Martin.  Overall, the grant will help prepare 2,500 graduates at Bladensburg and other schools across the county to succeed academically and graduate career-ready in the high-demand fields of health care and information technology.
 
On the same day, the Departments of Education and Labor launched the Registered Apprenticeship-College Consortium, a new effort that will allow graduates of registered apprenticeship programs to turn their years of rigorous on-the-job and classroom training into college credits toward an associate’s or bachelor’s degree.  Registered apprenticeship programs are sponsored by joint employer and labor groups, individual employers, or employer associations.  Currently, the registered apprenticeship system includes a network of more than 19,000 programs nationwide — offering nearly 1,000 different career opportunities.  Participating sponsors will have their programs evaluated by a third-party organization (for example, the American Council on Education or the National College Credit Recommendation Service) to determine the college credit value of the apprenticeship completion certificate.  Graduates will be able to earn up to 60 credits based on their apprenticeship experience.