Only train on pages on which document.interestCohort is called #33

dmarti · 2020-11-25T19:31:55Z

In order to limit inadvertent sensitive group tagging, only train the FLoC classifier on the URL or content of pages on which document.interestCohort has been called. If the owner of a page wants to make it available for training but ignore the cohort, they can ignore the return value of the function.
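The proposed rule can be pictured as a filter over browsing history. The sketch below is only an illustration of that idea, not FLoC's actual implementation: the `PageVisit` type, its `calledInterestCohort` flag, and the `trainingInputs` function are hypothetical names made up for this example.

```typescript
// Hypothetical record of one page visit; not part of any FLoC specification.
interface PageVisit {
  url: string;
  // True if any script on the page called document.interestCohort,
  // whether or not the page used the return value.
  calledInterestCohort: boolean;
}

// Keep only the visits to pages that called the API, per the proposed rule.
function trainingInputs(history: PageVisit[]): string[] {
  return history
    .filter((visit) => visit.calledInterestCohort)
    .map((visit) => visit.url);
}

const sampleHistory: PageVisit[] = [
  { url: "https://news.example/article", calledInterestCohort: true },
  { url: "https://game.example/play", calledInterestCohort: false },
];

// Only the first URL would be offered to the classifier.
const inputs = trainingInputs(sampleHistory);
console.log(inputs);
```

Under this rule, a page that never calls the API contributes nothing to training, so no site is included by default.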

The web currently has more than 1.2 billion sites (including parked domains). It is impractical for even a large browser developer to test which usage patterns on which sites inadvertently reveal sensitive information about a user.

For example, web history on an ordinary-looking web-based game could result in providing inputs to the machine learning algorithm that train it to recognize a set of users with a specific disability that affects their gameplay, and expose that set of users as a cohort to any site they visit -- without revealing to the users affected that their cohort reveals this sensitive information to all those sites.

Patterns of usage of a set of general-interest world history or culture sites could result in training the algorithm to recognize people with specific political or religious concerns, again without revealing to the people affected that their cohort is flagging them as a likely member of a protected or at-risk group.

Many other patterns of emergent sensitive group tagging would likely become evident only after FLoC has been deployed to real-world users with real web histories.

Source: https://news.netcraft.com/archives/category/web-server-survey/

Related issue: Publisher opt-out ? (Issue #13 covers an explicit opt-out that would remain in effect even if a script on the page later calls document.interestCohort)

joshuakoran · 2020-11-25T19:48:55Z

The post above raises a good point about the web author (publisher) expectations as to the control they ought to have over the operations of their web business.

The post also raises a good point about people's expectations. To expand on this second issue, people have real concerns around the attributes associated with their web client being used to harm them (e.g., embarrass people with sensitive health conditions or to deny them health insurance -- "a set of users with a specific disability that affects their gameplay, and expose that set of users as a cohort to any site they visit").

As we look to improve the web for people, it would be good to emphasize how proposals are protecting people from harms such as these, since they are unrelated to the existence of cross-origin IDs and instead associated with how data (attributes) are used to match content to people.

As has been discussed at TPAC and IWA BG, many of these issues are policy matters rather than technical ones and hence we can do a better job of delineating how proposals are improving the web for which stakeholders and the (unintended) impact to other stakeholders that may result.

jkarlin · 2020-11-25T19:52:11Z

Thanks for opening the issue @dmarti. We're actively considering whether FLoC should be opt-in or opt-out, and this proposal is a likely opt-in scenario. I think we'd also need a top-level opt-out option, so that a page on which some third party uses the interestCohort API wouldn't unknowingly or unwillingly be opted in. The question then becomes: what happens to the API call if the page is opted out? My preference is to return an empty string in that case.

dmarti · 2020-11-25T20:50:32Z

Yes, it makes sense that if the page is opted out and any script on the page calls interestCohort, it should get an empty string.

Sounds like an example of reciprocity -- if you want to use FLoC, you pay in to FLoC by allowing training. This would give some needed discretion to sites to use FLoC responsibly, so they could check with their own requirements (and update privacy policies if necessary) before turning it on.
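The combination discussed here (opt in by calling the API, with a top-level opt-out that wins) can be sketched as two small functions. This is a simplified model under stated assumptions, not the browser's behavior: `PageState`, `interestCohortResult`, and `usedForTraining` are hypothetical names, the result is modeled as a plain string rather than whatever the real API returns, and `"1234"` is a placeholder cohort value.

```typescript
// Hypothetical per-page state; the field names are assumptions for this sketch.
interface PageState {
  // True if the page set a top-level opt-out.
  optedOut: boolean;
  // True if any script on the page called interestCohort.
  calledInterestCohort: boolean;
}

// An opted-out page gets an empty string, whichever script makes the call.
function interestCohortResult(page: PageState, cohortId: string): string {
  return page.optedOut ? "" : cohortId;
}

// Reciprocity: a page feeds training only if it calls the API and has not opted out.
function usedForTraining(page: PageState): boolean {
  return page.calledInterestCohort && !page.optedOut;
}

const optedOutPage: PageState = { optedOut: true, calledInterestCohort: true };
const optedInPage: PageState = { optedOut: false, calledInterestCohort: true };

console.log(interestCohortResult(optedOutPage, "1234"), usedForTraining(optedOutPage));
console.log(interestCohortResult(optedInPage, "1234"), usedForTraining(optedInPage));
```

In this model the opt-out takes precedence, so a third-party script calling the API cannot opt in a page whose owner has opted out.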

ph00lt0 · 2021-03-07T12:56:12Z

The opt-in decision is not only to be made by the publisher. It should be the end user's decision to allow evil tracking technologies. FLoC is in essence not anonymous, therefore it should not automatically opt in users by any means. Ignoring that would result in an undemocratic system.

dmarti · 2021-03-18T20:01:24Z

Issue #61 points out that a browser extension will be able to obtain the user's cohort by injecting a script that calls interestCohort. There are legit reasons for an extension to be able to get the cohort (see #17). However, an extension might inject such a script into a page that does not already call interestCohort and thereby opt the page into FLoC training.
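The concern can be made concrete with a small model of the "any call opts the page in" rule. Everything here is hypothetical: `CohortCall`, its `source` field, and `pageOptedIn` are illustrative names, and the sketch does not claim that a browser records where a call came from.

```typescript
// Hypothetical record of one interestCohort call observed on a page.
interface CohortCall {
  // Who made the call; tracked here only to make the problem visible.
  source: "page" | "extension";
}

// Naive rule: any call at all opts the page into training.
function pageOptedIn(calls: CohortCall[]): boolean {
  return calls.length > 0;
}

// The page's own scripts never call the API, but an extension injects a call.
const extensionOnlyCalls: CohortCall[] = [{ source: "extension" }];

// The page ends up opted in even though its owner made no such choice.
console.log(pageOptedIn(extensionOnlyCalls));
```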

michaelkleber mentioned this issue on Feb 11, 2021: Availability for experimentation #25 (Closed)

dmarti mentioned this issue on Feb 15, 2021: Virtuous Incentives / Compensation to join FLoC? #45 (Open)

dmarti mentioned this issue on Nov 4, 2021: Add HTTP header to opt out of "interest cohort" training mdn/yari#3159 (Closed)
