Add broken internal links checker to docsite tests (#3736)
I couldn't find any convenient link checkers on Google that were easy to customize
(e.g. whitelisting links, files, etc.), so I wrote one myself using
JSoup.

Will add a broken external links checker in a follow-up.
lihaoyi authored Oct 14, 2024
1 parent fc03373 commit 5d0c8bf
Showing 11 changed files with 78 additions and 16 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/run-tests.yml
@@ -44,7 +44,7 @@ jobs:
  - uses: actions/checkout@v4
  with: { fetch-depth: 0 }

- - run: ./mill -i docs.githubPages
+ - run: ./mill -i docs.githubPages + docs.checkBrokenLinks

  linux:
  needs: build-linux
4 changes: 2 additions & 2 deletions contrib/playlib/readme.adoc
@@ -128,8 +128,8 @@ object core extends PlayApiModule {

== Play configuration options

- The Play modules themselves don't have specific configuration options at this point but the <<router-configuration-options,router
- module configuration options>> and the <<_twirl_configuration_options>> are applicable.
+ The Play modules themselves don't have specific configuration options at this point but the <<_router_configuration_options,router
+ module configuration options>> and the <<contrib/twirllib.adoc#_twirl_configuration_options>> are applicable.

== Additional play libraries

2 changes: 1 addition & 1 deletion docs/modules/ROOT/pages/depth/evaluation-model.adoc
@@ -39,7 +39,7 @@ of. In general, we have found that having "two places" to put code - outside of

The hard boundary between these two phases is what lets users easily query
and visualize their module hierarchy and task graph without running them: using
- xref:scalalib/builtin-commands.adoc#inspect[inspect], xref:scalalib/builtin-commands.adoc#plan[plan],
+ xref:scalalib/builtin-commands.adoc#_inspect[inspect], xref:scalalib/builtin-commands.adoc#_plan[plan],
xref:scalalib/builtin-commands.adoc#_visualize[visualize], etc.. This helps keep your
Mill build discoverable even as the `build.mill` codebase grows.

2 changes: 1 addition & 1 deletion docs/modules/ROOT/pages/extending/import-ivy-plugins.adoc
@@ -21,7 +21,7 @@ include::partial$example/extending/imports/2-import-ivy-scala.adoc[]
== Importing Plugins

Mill plugins are ordinary JVM libraries jars and are loaded as any other external dependency with
- the xref:extending/import-ivy-plugins.adoc[`import $ivy` mechanism].
+ the `import $ivy` mechanism.

There exist a large number of Mill plugins, Many of them are available on GitHub and via
Maven Central. We also have a list of plugins, which is most likely not complete, but it
4 changes: 2 additions & 2 deletions docs/modules/ROOT/pages/fundamentals/query-syntax.adoc
@@ -28,7 +28,7 @@ There are two kind of segments: _label segments_ and _cross segments_.
_Label segments_ are the components of a task path and have the same restriction as Scala identifiers.
They must start with a letter and may contain letters, numbers and a limited set of special characters `-` (dash), `_` (underscore).
- They are used to denote Mill modules, tasks, but in the case of xref:fundamentals/modules.adoc#external-modules[external modules] their Scala package names.
+ They are used to denote Mill modules, tasks, but in the case of xref:fundamentals/modules.adoc#_external_modules[external modules] their Scala package names.
_Cross segments_ start with a label segment but contain additional square brackets (`[`, `]`]) and are used to denote cross module and their parameters.
@@ -133,7 +133,7 @@ There is a subtile difference between the expansion of <<enumerations,enumeratio

For all the former versions, Mill parses them into a complex but single task selector path and subsequent parameters are used for all resolved tasks.

- Whereas the `+` start a completely new selector path to which you can also provide a different parameter list. This is important when using xref:fundamentals/tasks.adoc#commands[command tasks] which can accept their own parameters. The `JavaModule.run` command is an example.
+ Whereas the `+` start a completely new selector path to which you can also provide a different parameter list. This is important when using xref:fundamentals/tasks.adoc#_commands[command tasks] which can accept their own parameters. The `JavaModule.run` command is an example.

----
> mill foo.run hello # <1>
4 changes: 2 additions & 2 deletions docs/modules/ROOT/partials/Installation_IDE_Support.adoc
@@ -389,7 +389,7 @@ automatically open a pull request to update your Mill version (in
`.mill-version` or `.config/mill-version` file), whenever there is a newer version available.

TIP: Scala Steward can also
- xref:scalalib/module-config.adoc#_keeping_up_to_date_with_scala_steward[scan your project dependencies]
+ xref:scalalib/dependencies.adoc#_keeping_up_to_date_with_scala_steward[scan your project dependencies]
and keep them up-to-date.

=== Development Releases
@@ -401,7 +401,7 @@ https://github.com/com-lihaoyi/mill/releases[available] as binaries named
`+#.#.#-n-hash+` linked to the latest tag.

The easiest way to use a development release is to use one of the
- <<_bootstrap_scripts>>, which support <<_overriding_mill_versions>> via an
+ <<_bootstrap_scripts>>, which support overriding Mill versions via an
`MILL_VERSION` environment variable or a `.mill-version` or `.config/mill-version` file.


2 changes: 1 addition & 1 deletion docs/modules/ROOT/partials/Intro_to_Mill_Header.adoc
@@ -28,7 +28,7 @@ digraph G {
{mill-github-url}[Mill] is a fast multi-language JVM build tool that supports {language}, making your
common development workflows xref:comparisons/maven.adoc[5-10x faster to Maven], or
xref:comparisons/gradle.adoc[2-4x faster than Gradle], and
- xref:comparisons/sbt[easier to use than SBT].
+ xref:comparisons/sbt.adoc[easier to use than SBT].
Mill aims to make your JVM project's build process performant, maintainable, and flexible
even as it grows from a small project to a large codebase or monorepo with hundreds of modules:

67 changes: 64 additions & 3 deletions docs/package.mill
@@ -1,9 +1,9 @@
package build.docs
import org.jsoup._
import mill.util.Jvm
import mill._, scalalib._
import de.tobiasroeser.mill.vcs.version.VcsVersion
import guru.nidi.graphviz.engine.AbstractJsGraphvizEngine
import guru.nidi.graphviz.engine.{Format, Graphviz}
import collection.JavaConverters._

/** Generates the mill documentation with Antora. */
object `package` extends RootModule {
@@ -205,7 +205,7 @@ object `package` extends RootModule {
| sources:
| - url: ${if (authorMode) build.baseDir else build.Settings.projectUrl}
| branches: []
- | tags: ${build.Settings.legacyDocTags.map("'" + _ + "'").mkString("[", ",", "]")}
+ | tags: ${build.Settings.legacyDocTags.filter(_ => !authorMode).map("'" + _ + "'").mkString("[", ",", "]")}
| start_path: docs/antora
|
|${taggedSources.mkString("\n\n")}
@@ -265,6 +265,7 @@ object `package` extends RootModule {
T.log.outputStream.println(
s"You can browse the local pages at: ${(pages.path / "index.html").toNIO.toUri()}"
)
+ pages
}

def generatePages(authorMode: Boolean) = T.task { extraSources: Seq[os.Path] =>
@@ -346,4 +347,64 @@ object `package` extends RootModule {
}
}
}
+
+ def allLinksAndAnchors: T[IndexedSeq[(os.Path, Seq[(String, String)], Seq[(String, String)], Set[String])]] = Task {
+ val base = fastPages().path
+ val validExtensions = Set("html", "scala")
+ for (path <- os.walk(base) if validExtensions(path.ext))
+ yield {
+ val parsed = Jsoup.parse(os.read(path))
+ val (remoteLinks, localLinks) = parsed
+ .select("a")
+ .asScala
+ .map(e => (e.toString, e.attr("href")))
+ .toSeq
+ .partition{case (e, l) => l.startsWith("http://") || l.startsWith("https://")}
+ (
+ path,
+ remoteLinks,
+ localLinks.map{case (e, l) => (e, l.stripPrefix("file:"))},
+ parsed.select("*").asScala.map(_.attr("id")).filter(_.nonEmpty).toSet,
+ )
+ }
+ }
+ def checkBrokenLinks() = Task.Command{
+ if (brokenLinks().nonEmpty){
+ throw new Exception("Broken Links: " + upickle.default.write(brokenLinks(), indent = 2))
+ }
+ }
+ def brokenLinks: T[Map[os.Path, Seq[(String, String)]]] = Task{
+ val allLinksAndAnchors0 = allLinksAndAnchors()
+ val pathsToIds = allLinksAndAnchors()
+ .map{case (path, remoteLinks, localLinks, ids) => (path, ids)}
+ .toMap
+
+ val brokenLinksPerPath: Seq[(os.Path, Seq[(String, String)])] =
+ for ((path, remoteLinks, localLinks, ids) <- allLinksAndAnchors0) yield{
+ (
+ path,
+ localLinks.flatMap{case (elementString, url) =>
+ val (baseUrl, anchorOpt) = url match {
+ case s"#$anchor" => (path.toString, Some(anchor))
+ case s"$prefix#$anchor" => (prefix, Some(anchor))
+
+ case url => (url, None)
+ }
+
+ val dest0 = os.Path(baseUrl, path / "..")
+ val possibleDests = Seq(dest0, dest0 / "index.html")
+ possibleDests.find(os.exists(_)) match{
+ case None => Some((elementString, url))
+ case Some(dest) =>
+ anchorOpt.collect{case a if !pathsToIds.getOrElse(dest, Set()).contains(a) => (elementString, url)}
+ }
+ }
+ )
+ }
+
+ val nonEmptyBrokenLinksPerPath = brokenLinksPerPath
+ .filter{ case (path, items) => path.last != "404.html" && items.nonEmpty }
+
+ nonEmptyBrokenLinksPerPath.toMap
}
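
Reviewer note, not part of the commit: the core decision made by `brokenLinks` can be modelled without touching the filesystem, which may help when reasoning about edge cases. The sketch below is a simplified stand-in under stated assumptions. The names `LinkCheckSketch`, `splitHref` and `isBroken` are hypothetical, pages are represented as a plain `Map` from site-relative path to the set of element ids on that page, and relative hrefs are resolved with `java.net.URI` rather than the `os.Path` calls the commit uses.

```scala
import java.net.URI

// Hypothetical, filesystem-free model of the check in `brokenLinks`.
object LinkCheckSketch {

  // Split an href into (target page, optional anchor), mirroring the pattern
  // match in the commit. A bare "#anchor" points at the current page.
  def splitHref(currentPage: String, href: String): (String, Option[String]) =
    href match {
      case s"#$anchor"        => (currentPage, Some(anchor))
      case s"$prefix#$anchor" => (prefix, Some(anchor))
      case other              => (other, None)
    }

  // A link is broken when neither the target nor target/index.html exists,
  // or when it names an anchor that is not among the destination page's ids.
  // Assumption: `pageIds` stands in for the os.exists check plus `pathsToIds`.
  def isBroken(
      pageIds: Map[String, Set[String]],
      currentPage: String,
      href: String
  ): Boolean = {
    val (target, anchorOpt) = splitHref(currentPage, href)
    // Resolve relative to the directory of the current page.
    val resolved = URI.create(currentPage).resolve(target).getPath
    val candidates = Seq(resolved, resolved.stripSuffix("/") + "/index.html")
    candidates.find(pageIds.contains) match {
      case None       => true
      case Some(dest) => anchorOpt.exists(a => !pageIds(dest).contains(a))
    }
  }
}
```

With a map containing `/guide/intro.html` and `/api/index.html`, a call such as `LinkCheckSketch.isBroken(pages, "/guide/intro.html", "../api/#_tasks")` exercises both the `index.html` fallback and the anchor lookup. Unlike the commit, this sketch does not skip `404.html` and will throw on hrefs that are not valid URIs.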
2 changes: 1 addition & 1 deletion example/fundamentals/tasks/2-primary-tasks/build.mill
@@ -1,7 +1,7 @@
// There are three primary kinds of _Tasks_ that you should care about:
//
// * <<_sources>>, defined using `Task.Sources {...}`
- // * <<_tasks>>, defined using `Task {...}`
+ // * <<_cached_tasks>>, defined using `Task {...}`
// * <<_commands>>, defined using `Task.Command {...}`

// === Sources
2 changes: 1 addition & 1 deletion example/fundamentals/tasks/4-inputs/build.mill
@@ -13,7 +13,7 @@ def myInput = Task.Input {
// arbitrary block of code.
//
// Inputs can be used to force re-evaluation of some external property that may
- // affect your build. For example, if I have a <<_cached_task, cached task>> `bar` that
+ // affect your build. For example, if I have a xref:#_cached_tasks[cached task] `bar` that
// calls out to `git` to compute the latest commit hash and message directly,
// that target does not have any `Task` inputs and so will never re-compute
// even if the external `git` status changes:
3 changes: 2 additions & 1 deletion mill-build/build.sc
@@ -10,6 +10,7 @@ object `package` extends MillBuildRootModule {
// TODO: implement empty version for ivy deps as we do in import parser
ivy"com.lihaoyi::mill-contrib-buildinfo:${mill.api.BuildInfo.millVersion}",
ivy"com.goyeau::mill-scalafix::0.4.1",
- ivy"com.lihaoyi::mill-main-graphviz:${mill.api.BuildInfo.millVersion}"
+ ivy"com.lihaoyi::mill-main-graphviz:${mill.api.BuildInfo.millVersion}",
+ ivy"org.jsoup:jsoup:1.12.1"
)
}
