Document models-as-data barriers and barrier guards and add change notes #21523
owen-mc wants to merge 19 commits into github:main
Pull request overview
Adds documentation and release notes describing how to model Models-as-Data (MaD) data-flow barriers and barrier guards across multiple CodeQL language libraries.
Changes:
- Added per-language change notes announcing support for MaD barriers/barrier guards in data extensions.
- Updated multiple “Customizing library models …” guides to list `barrierModel`/`barrierGuardModel` extensibles and provide examples.
- Added new barrier/barrier-guard example sections in several language guides (for example, Java/Go/Python/Ruby/JavaScript/C#/C++).
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 12 comments.
Show a summary per file
| File | Description |
|---|---|
| rust/ql/lib/change-notes/2026-03-20-data-extensions-barriers.md | Adds Rust library change note announcing barrier/barrier-guard support. |
| ruby/ql/lib/change-notes/2026-03-20-data-extensions-barriers.md | Adds Ruby library change note with a link to the Ruby modeling guide. |
| python/ql/lib/change-notes/2026-03-20-data-extensions-barriers.md | Adds Python library change note with a link to the Python modeling guide. |
| javascript/ql/lib/change-notes/2026-03-20-data-extensions-barriers.md | Adds JavaScript library change note with a link to the JavaScript modeling guide. |
| java/ql/lib/change-notes/2026-03-20-data-extensions-barriers.md | Adds Java library change note with a link to the Java/Kotlin modeling guide. |
| go/ql/lib/change-notes/2026-03-20-data-extensions-barriers.md | Adds Go library change note with a link to the Go modeling guide. |
| csharp/ql/lib/change-notes/2026-03-20-data-extensions-barriers.md | Adds C# library change note with a link to the C# modeling guide. |
| cpp/ql/lib/change-notes/2026-03-20-data-extensions-barriers.md | Adds C/C++ library change note with a link to the C/C++ modeling guide. |
| docs/codeql/codeql-language-guides/customizing-library-models-for-ruby.rst | Documents Ruby barrierModel / barrierGuardModel and adds examples. |
| docs/codeql/codeql-language-guides/customizing-library-models-for-python.rst | Documents Python barrierModel / barrierGuardModel and adds examples. |
| docs/codeql/codeql-language-guides/customizing-library-models-for-javascript.rst | Documents JavaScript barrierModel / barrierGuardModel and adds examples. |
| docs/codeql/codeql-language-guides/customizing-library-models-for-java-and-kotlin.rst | Documents Java/Kotlin barrierModel / barrierGuardModel and adds examples. |
| docs/codeql/codeql-language-guides/customizing-library-models-for-go.rst | Documents Go barrierModel / barrierGuardModel and adds examples. |
| docs/codeql/codeql-language-guides/customizing-library-models-for-csharp.rst | Documents C# barrierModel / barrierGuardModel and adds examples. |
| docs/codeql/codeql-language-guides/customizing-library-models-for-cpp.rst | Documents C/C++ barrierModel / barrierGuardModel and adds examples. |
| --- | ||
| category: feature | ||
| --- | ||
| * Data flow barriers and barrier guards can now be added using data extensions. For more information see `Customizing library models for C# <https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-csharp/>`__. |
This is a Markdown change note, but the “Title <url>__” construct is reStructuredText and is currently wrapped in backticks, so it will render as inline code (not as a link). Please switch to a Markdown link format so the docs URL renders correctly.
| * Data flow barriers and barrier guards can now be added using data extensions. For more information see `Customizing library models for C# <https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-csharp/>`__. | |
| * Data flow barriers and barrier guards can now be added using data extensions. For more information see [Customizing library models for C#](https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-csharp/). |
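The same substitution applies to each of the change notes flagged in this review. As a minimal sketch (not part of the PR), assuming the notes only contain links in the `` `Title <url>`__ `` form, the conversion could be scripted as follows; `rst_links_to_markdown` is a hypothetical helper name.

```python
import re

# Matches the reStructuredText anonymous hyperlink form: `Title <url>`__
RST_LINK = re.compile(r"`([^`<]+?)\s*<([^>]+)>`__")


def rst_links_to_markdown(text: str) -> str:
    """Rewrite `Title <url>`__ links as Markdown [Title](url) links."""
    return RST_LINK.sub(r"[\1](\2)", text)


# The C# change note text quoted in the review comment above.
note = (
    "For more information see `Customizing library models for C# "
    "<https://codeql.github.com/docs/codeql-language-guides/"
    "customizing-library-models-for-csharp/>`__."
)
print(rst_links_to_markdown(note))
```

Text without a reStructuredText link passes through unchanged, so the helper can be run over a whole change note safely.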
| --- | ||
| category: feature | ||
| --- | ||
| * Data flow barriers and barrier guards can now be added using data extensions. For more information see `Customizing library models for Python <https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-python/>`__. |
This is a Markdown change note, but the “Title <url>__” construct is reStructuredText and is currently wrapped in backticks, so it will render as inline code (not as a link). Please switch to a Markdown link format so the docs URL renders correctly.
| * Data flow barriers and barrier guards can now be added using data extensions. For more information see `Customizing library models for Python <https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-python/>`__. | |
| * Data flow barriers and barrier guards can now be added using data extensions. For more information see [Customizing library models for Python](https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-python/). |
| - ``barrierModel(namespace, type, subtypes, name, signature, ext, output, kind, provenance)``. This is used to model barriers, which are elements that stop the flow of taint. | ||
| - ``barrierGuardModel(namespace, type, boolean subtypes, name, signature, ext, input, acceptingvalue, kind, provenance)``. This is used to model barrier guards, which are elements that can stop the flow of taint depending on a conditional check. |
These signatures use `namespace` for the first parameter and `boolean subtypes` for the third, but the Java/Kotlin MaD extensible predicates use `package` as the first column and the third column is named `subtypes` (a boolean flag), consistent with `sourceModel`/`sinkModel`/`summaryModel`. Please align the parameter names here to avoid misleading readers.
| - ``barrierModel(namespace, type, subtypes, name, signature, ext, output, kind, provenance)``. This is used to model barriers, which are elements that stop the flow of taint. | |
| - ``barrierGuardModel(namespace, type, boolean subtypes, name, signature, ext, input, acceptingvalue, kind, provenance)``. This is used to model barrier guards, which are elements that can stop the flow of taint depending on a conditional check. | |
| - ``barrierModel(package, type, subtypes, name, signature, ext, output, kind, provenance)``. This is used to model barriers, which are elements that stop the flow of taint. | |
| - ``barrierGuardModel(package, type, subtypes, name, signature, ext, input, acceptingvalue, kind, provenance)``. This is used to model barrier guards, which are elements that can stop the flow of taint depending on a conditional check. |
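To make the column order concrete, here is a small illustrative sketch that pairs each column name from the suggested signatures with the values of the Java example rows quoted in this review. The `label_row` helper is hypothetical and is not part of CodeQL; it only shows which value lands in which column.

```python
# Column names as listed in the suggested Java/Kotlin signatures.
BARRIER_COLUMNS = (
    "package", "type", "subtypes", "name", "signature",
    "ext", "output", "kind", "provenance",
)
BARRIER_GUARD_COLUMNS = (
    "package", "type", "subtypes", "name", "signature",
    "ext", "input", "acceptingvalue", "kind", "provenance",
)


def label_row(columns, row):
    """Pair each value in a data-extension row with its column name."""
    if len(columns) != len(row):
        raise ValueError(f"expected {len(columns)} values, got {len(row)}")
    return dict(zip(columns, row))


# Rows copied from the Java guide examples quoted in this review.
barrier = label_row(
    BARRIER_COLUMNS,
    ["java.io", "File", True, "getName", "()", "", "ReturnValue",
     "path-injection", "manual"],
)
guard = label_row(
    BARRIER_GUARD_COLUMNS,
    ["java.net", "URI", True, "isAbsolute", "()", "", "Argument[this]",
     "false", "request-forgery", "manual"],
)

for column, value in guard.items():
    print(f"{column}: {value!r}")
```

The only structural difference between the two predicates is that the barrier guard row replaces `output` with `input` and adds `acceptingvalue`, which is why it has ten values instead of nine.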
| - ``sinkModel(namespace, type, subtypes, name, signature, ext, input, kind, provenance)``. This is used to model sinks where tainted data may be used in a way that makes the code vulnerable. | ||
| - ``summaryModel(namespace, type, subtypes, name, signature, ext, input, output, kind, provenance)``. This is used to model flow through elements. | ||
| - ``barrierModel(namespace, type, subtypes, name, signature, ext, output, kind, provenance)``. This is used to model barriers, which are elements that stop the flow of taint. | ||
| - ``barrierGuardModel(namespace, type, boolean subtypes, name, signature, ext, input, acceptingvalue, kind, provenance)``. This is used to model barrier guards, which are elements that can stop the flow of taint depending on a conditional check. |
In the predicate signature list, the third parameter is named `subtypes` elsewhere (and described as a boolean flag in examples). Using `boolean subtypes` here is inconsistent and reads like a typo; consider keeping the parameter name `subtypes` for consistency with the other signatures.
| - ``barrierGuardModel(namespace, type, boolean subtypes, name, signature, ext, input, acceptingvalue, kind, provenance)``. This is used to model barrier guards, which are elements that can stop the flow of taint depending on a conditional check. | |
| - ``barrierGuardModel(namespace, type, subtypes, name, signature, ext, input, acceptingvalue, kind, provenance)``. This is used to model barrier guards, which are elements that can stop the flow of taint depending on a conditional check. |
| --- | ||
| category: feature | ||
| --- | ||
| * Data flow barriers and barrier guards can now be added using data extensions. For more information see `Customizing library models for Ruby <https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-ruby/>`__. |
This is a Markdown change note, but the “Title <url>__” construct is reStructuredText and is currently wrapped in backticks, so it will render as inline code (not as a link). Please switch to a Markdown link format so the docs URL renders correctly.
| * Data flow barriers and barrier guards can now be added using data extensions. For more information see `Customizing library models for Ruby <https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-ruby/>`__. | |
| * Data flow barriers and barrier guards can now be added using data extensions. For more information see [Customizing library models for Ruby](https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-ruby/). |
| --- | ||
| category: feature | ||
| --- | ||
| * Data flow barriers and barrier guards can now be added using data extensions. For more information see `Customizing library models for Java and Kotlin <https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-java-and-kotlin/>`__. |
This is a Markdown change note, but the “Title <url>__” construct is reStructuredText and is currently wrapped in backticks, so it will render as inline code (not as a link). Please switch to a Markdown link format so the docs URL renders correctly.
| * Data flow barriers and barrier guards can now be added using data extensions. For more information see `Customizing library models for Java and Kotlin <https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-java-and-kotlin/>`__. | |
| * Data flow barriers and barrier guards can now be added using data extensions. For more information see [Customizing library models for Java and Kotlin](https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-java-and-kotlin/). |
| --- | ||
| category: feature | ||
| --- | ||
| * Data flow barriers and barrier guards can now be added using data extensions. For more information see `Customizing library models for Go <https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-go/>`__. |
This is a Markdown change note, but the “Title <url>__” construct is reStructuredText and is currently wrapped in backticks, so it will render as inline code (not as a link). Please switch to a Markdown link format so the docs URL renders correctly.
| * Data flow barriers and barrier guards can now be added using data extensions. For more information see `Customizing library models for Go <https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-go/>`__. | |
| * Data flow barriers and barrier guards can now be added using data extensions. For more information see [Customizing library models for Go](https://codeql.github.com/docs/codeql-language-guides/customizing-library-models-for-go/). |
Force-pushed from 3ab2153 to 052b41a.
| - The fifth value ``()`` is the method input type signature. | ||
| The sixth value should be left empty and is out of scope for this documentation. | ||
| The remaining values are used to define the ``access path``, the ``accepting value``, the ``kind``, and the ``provenance`` (origin) of the barrier guard. |
I find it weird that "access path" and "accepting value" are quoted as code, but they contain spaces, so they're not identifiers. Occurs elsewhere as well.
Maybe it's best just to treat these four terms as natural language terms and just not quote them.
It seems to be a convention that column names are quoted this way. It's used in lots of the docs about MaD. I don't think we should change the convention. We could use hyphens instead of spaces in column names.
I've pushed a commit to use hyphens for access-path and accepting-value.
aschackmull left a comment
A few comments, otherwise generally LGTM. But I guess we'll want a docs review for this.
Force-pushed from 9977b1c to 260a1a0.
yoff left a comment
Python 👍 One comment, but not blocking.
| if url_has_allowed_host_and_scheme(url, allowed_hosts=...): # The check guards the use of 'url', so it is safe. | ||
| redirect(url) # This is safe. | ||
| We need to add a tuple ``(type, path, branch, kind)`` to the ``barrierGuardModel`` extensible predicate by updating a data extension file (the extension ID is implicit and auto-assigned). |
I think the extension ID has not been mentioned anywhere at this point, is it simply confusing to talk about it here?
Thanks for spotting that. It's definitely confusing. I've removed everything in parentheses in the commit "Remove mention of extension ID".
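For readers unfamiliar with the pattern being modeled, the following self-contained sketch illustrates what a barrier guard with accepting value `true` means: the value is only considered safe on the branch where the check succeeded. The `is_allowed_url` function is a hypothetical stand-in for a validator such as `url_has_allowed_host_and_scheme`, and `safe_redirect_target` is likewise an illustrative name, not code from the guide.

```python
from urllib.parse import urlparse


def is_allowed_url(url, allowed_hosts):
    """Hypothetical validator: accept only http(s) URLs on an allowed host."""
    parsed = urlparse(url)
    return parsed.scheme in ("http", "https") and parsed.hostname in allowed_hosts


def safe_redirect_target(url, allowed_hosts, fallback="/"):
    """Return url only on the branch where the guard returned True."""
    if is_allowed_url(url, allowed_hosts):
        # The check guards this use of 'url', so it is safe here.
        return url
    # On the other branch the value is still untrusted, so it is not used.
    return fallback


print(safe_redirect_target("https://example.com/home", {"example.com"}))
print(safe_redirect_target("https://evil.test/", {"example.com"}))
```

A barrier guard model tells the analysis to treat the guarded argument as sanitized only on the accepting branch, which corresponds to the first `return` above.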
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Force-pushed from 260a1a0 to f79ffe7.
I rebased on
I'm opening a release so we can post a changelog for this change as it was a big ask from our customers. Can we time this to go public next Tuesday? (We get Tuesdays as the designated day for security releases.)
In some sense this is already public (all the development has been done in the open). I assume you mean merging this doc PR on Tuesday. (Or is there some action needed to update the public docs? I am not sure.) That is fine with me, assuming it gets a doc review by then. (My first request was missed somehow, but it's on their radar now.)
@owen-mc Looks good overall! Sorry again about the mixup that delayed us reviewing this. We've created an issue to look into why it didn't get added to our review board.
I just noticed a few small formatting things. The main one was a duplicate sentence in most of the examples; at least it felt confusing to me, so I thought it could be removed. However, if it is there for a reason, feel free to ignore!
| data: | ||
| - ["", "", False, "is_safe", "", "", "Argument[*0]", "true", "sql-injection", "manual"] | ||
| Since we are adding a barrier guard, we need to add a tuple to the ``barrierGuardModel`` extensible predicate. |
| Since we are adding a barrier guard, we need to add a tuple to the ``barrierGuardModel`` extensible predicate. |
This seems a duplicate sentence to me since we already said we are adding a tuple above the code block, but ignore if it is needed!
| data: | ||
| - ["", "", False, "mysql_real_escape_string", "", "", "Argument[*1]", "sql-injection", "manual"] | ||
| Since we are adding a barrier, we need to add a tuple to the ``barrierModel`` extensible predicate. |
| Since we are adding a barrier, we need to add a tuple to the ``barrierModel`` extensible predicate. |
This seems a duplicate sentence to me since we already said we are adding a tuple above the code block, but ignore if it is needed!
| data: | ||
| - ["my-package", "Member[isValid].Argument[0]", "true", "sql-injection"] | ||
| - Since we are adding a barrier guard, we need to add a tuple to the **barrierGuardModel** extensible predicate. |
| - Since we are adding a barrier guard, we need to add a tuple to the **barrierGuardModel** extensible predicate. | |
| Since we are adding a barrier guard, we need to add a tuple to the **barrierGuardModel** extensible predicate. |
Think this can be just a sentence, the unordered list below is each column
| data: | ||
| - ["global", "Member[encodeURIComponent].ReturnValue", "html-injection"] | ||
| - Since we are adding a barrier, we need to add a tuple to the **barrierModel** extensible predicate. |
| - Since we are adding a barrier, we need to add a tuple to the **barrierModel** extensible predicate. | |
| Since we are adding a barrier, we need to add a tuple to the **barrierModel** extensible predicate. |
This can just be a sentence, and everything below is the unordered list of columns
| data: | ||
| - ["Validator!", "Method[is_safe].Argument[0]", "true", "sql-injection"] | ||
| - Since we are adding a barrier guard, we need to add a tuple to the **barrierGuardModel** extensible predicate. |
| - Since we are adding a barrier guard, we need to add a tuple to the **barrierGuardModel** extensible predicate. | |
| Since we are adding a barrier guard, we need to add a tuple to the **barrierGuardModel** extensible predicate. |
| - ["java.io", "File", True, "getName", "()", "", "ReturnValue", "path-injection", "manual"] | ||
| Since we are adding a new barrier, we need to add a tuple to the ``barrierModel`` extensible predicate. |
| Since we are adding a new barrier, we need to add a tuple to the ``barrierModel`` extensible predicate. |
Duplicate sentence to what is above the code block?
| - ["java.net", "URI", True, "isAbsolute", "()", "", "Argument[this]", "false", "request-forgery", "manual"] | ||
| Since we are adding a barrier guard, we need to add a tuple to the ``barrierGuardModel`` extensible predicate. |
| Since we are adding a barrier guard, we need to add a tuple to the ``barrierGuardModel`` extensible predicate. |
Duplicate sentence to what is above the code block?
| data: | ||
| - ["System.Web", "HttpRequest", False, "get_RawUrl", "()", "", "ReturnValue", "url-redirection", "manual"] | ||
| Since we are adding a barrier, we need to add a tuple to the ``barrierModel`` extensible predicate. |
| Since we are adding a barrier, we need to add a tuple to the ``barrierModel`` extensible predicate. |
Duplicate sentence to what is above the code block?
| data: | ||
| - ["System", "Uri", False, "get_IsAbsoluteUri", "()", "", "Argument[this]", "false", "url-redirection", "manual"] | ||
| Since we are adding a barrier guard, we need to add a tuple to the ``barrierGuardModel`` extensible predicate. |
| Since we are adding a barrier guard, we need to add a tuple to the ``barrierGuardModel`` extensible predicate. |
Duplicate sentence to what is above the code block?
Co-authored-by: Sarita Iyer <66540150+saritai@users.noreply.github.com>
To avoid copy-paste mistakes and make them more consistent we just use the word "model".
Co-authored-by: Copilot <copilot@github.com>
Thanks for your review. I agree about the duplication. It is copying a pattern that is already used in the files, so I've fixed it in those other places too. I've also tried to make the guides for the different languages use more consistent wording for introducing examples. And I've gone as far as switching the three that were using bold to use monospace instead.
@saritai Thanks for re-reviewing. One final question: once this is merged, does the public documentation automatically get updated? If so, how long does it take? (I admit that that was 2 questions...) We want to announce this feature on Tuesday so it would be good to make sure the documentation is updated on that day.
Note that there isn't yet a docs page called "Customizing library models for Rust" for me to add examples of models-as-data barriers or barrier guards to.
Models-as-data sanitizers haven't been added for Actions, because it doesn't have any barriers, or for Swift, because it doesn't have any models-as-data.