klibs-io

Project Url: JetBrains/klibs-io
Introduction: Search Kotlin Multiplatform projects and packages

Context

This project was started outside of JetBrains as a PoC and was never meant to be a serious project.

Initially, the website pages were served by the backend (SSR with Thymeleaf), but before the project was transferred to JetBrains, it was re-implemented as an independent backend with a REST API. Last commit with SSR: f4e46939.

While most of the project was refactored to look consistent and more like a decent codebase, you may still see traces of the early PoC stage with cut corners, hacks and TODOs.

Modules

The structure of the project tries to follow the "module by feature" approach, encapsulating distinct parts of the app in separate modules.

  • app - the main server module. Contains high-level configurations for the whole app. Serves as glue for all other modules. Runnable.

Core modules represent the essential parts of the app:

  • core/package - Maven packages (artifacts). For example, kotlinx-coroutines-core.
  • core/project - The highest-level aggregating entity. Maps to an scm-repository and consists of a number of packages.
  • core/scm-owner - Owners of scm-repository entities, be it organizations or individual authors. For example, github.com/Kotlin
  • core/scm-repository - (Git) repositories of project entities. For example, github.com/Kotlin/kotlinx.coroutines.
  • core/search - Search functionality across all data (projects, packages, owners, repositories)

Third party integrations reside separately:

  • integrations/ai - Integration with OpenAI. For example, for generating descriptions of libraries.
  • integrations/github - Integration with GitHub to collect info for scm-repository and scm-owner
  • integrations/maven - Integration with Maven Central to scan for new package entities.

Profiles & configuration properties

Spring profiles are used to run the app in different environments. Profile-specific configuration properties can be found in app/src/main/resources.

The prod profile is used to run the app in production; it hides or restricts some debug utilities. A separate local profile can be used for testing the app locally.

Note: application-prod.yml is just a template; it contains configuration properties that must be set for the app to work. Adapt it and/or use externalized configuration if necessary.

Scanning of Maven Central and indexing of packages can be enabled/disabled by setting the klibs.indexing property.

Files on disk

This app needs to store files on disk:

  • Cache of requests to GitHub's API, managed by OkHttp. Helps with avoiding rate limits. Configuration property: klibs.integration.github.cache.request-cache-path.
  • README files from GitHub, both in Markdown and in HTML. These are stored in S3. Configuration properties: klibs.readme.mode, klibs.readme.cache-dir, klibs.readme.s3.bucket-name, klibs.readme.s3.prefix.

Build & Run

Boot Jar

Run

./gradlew bootJar

Output: app/build/libs/app.jar

Run locally

Note: we use Docker Compose to run the app locally, so you need to have Docker installed.

You can run the main function from Application; it loads the Spring Boot application with the local profile.

Troubleshooting

In case of problems, check troubleshooting.md.

Endpoints

Swagger

Swagger API is available under /api-docs/swagger-ui.html

Actuator

Spring Actuator is used.

  • /actuator/health - has custom health indicators
  • /actuator/info - has custom info contributors

Workflow

You can find information about the development workflow in workflow.md.

Implementation details

Indexing logic

This is by far the most confusing part of the whole backend.

The general flow:

  1. Check for new artifacts (published since the last check) using Maven Central's API.
  2. If new artifacts are available, add them to the processing queue (the package_index_request table).
  3. Process the package indexing queue in a separate thread, one by one. If indexing of a package fails, increment its failed_attempts. Try to process each package up to N times. Projects and SCM owner/info are created in the process of indexing packages.
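
The queue-processing step above can be sketched roughly as follows. This is a minimal in-memory illustration, not the project's actual code: IndexRequest, processQueue, maxAttempts (standing in for N) and the index callback are hypothetical names, and the real queue lives in the package_index_request table. The sketch also retries a failed request right away by putting it back at the end of the queue, which is an assumption; the backend may instead pick it up on a later run.

```kotlin
// Hypothetical in-memory stand-in for a row of the package_index_request table.
data class IndexRequest(val coordinates: String, var failedAttempts: Int = 0)

// Processes the queue one request at a time.
// `maxAttempts` plays the role of "N"; `index` is a placeholder for the real indexing work.
// Returns the requests that were given up on after `maxAttempts` failures.
fun processQueue(
    queue: ArrayDeque<IndexRequest>,
    maxAttempts: Int,
    index: (IndexRequest) -> Unit,
): List<IndexRequest> {
    val abandoned = mutableListOf<IndexRequest>()
    while (queue.isNotEmpty()) {
        val request = queue.removeFirst()
        try {
            index(request)
        } catch (e: Exception) {
            // A failure only bumps the counter; the request stays eligible until the limit is hit.
            request.failedAttempts++
            if (request.failedAttempts < maxAttempts) {
                queue.addLast(request)
            } else {
                abandoned.add(request)
            }
        }
    }
    return abandoned
}
```

Keeping the attempt counter on the request itself means one broken package cannot block the queue forever: it is skipped once the limit is reached, while the remaining requests keep flowing.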

AI descriptions are generated by a separate scheduled task because the rate limits of OpenAI are much lower than those of GitHub and Maven Central, so it's significantly slower.

Information taken from GitHub (repository/owner) is updated by a separate scheduled task too, based on github_repo.updated_at.

As of this moment, PostgreSQL's Full Text Search (FTS) is used for search. All relevant data is aggregated in a single materialized view, project_index, which is updated periodically and is used for search queries. While it gets the job done, it leaves a lot to be desired.

At some point, FTS might need to be re-implemented to use Solr / ElasticSearch or something similar. Code-wise, it shouldn't be too difficult because all search-related logic is contained in the search module, so hopefully it's just a matter of re-implementing SearchRepository.
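
To illustrate why swapping the search backend should stay contained, here is a minimal sketch of the idea. Only the name SearchRepository comes from this README; its method signature, ProjectHit, InMemorySearchRepository and SearchService are hypothetical and are not taken from the codebase.

```kotlin
// Hypothetical result type; the real search results may carry different fields.
data class ProjectHit(val name: String, val description: String)

// Assumed shape of the repository; the real SearchRepository may differ.
interface SearchRepository {
    fun search(query: String, limit: Int): List<ProjectHit>
}

// Toy in-memory implementation standing in for a concrete search backend.
class InMemorySearchRepository(private val projects: List<ProjectHit>) : SearchRepository {
    override fun search(query: String, limit: Int): List<ProjectHit> =
        projects
            .filter {
                it.name.contains(query, ignoreCase = true) ||
                    it.description.contains(query, ignoreCase = true)
            }
            .take(limit)
}

// Callers depend only on the interface, so replacing the backend does not touch them.
class SearchService(private val repository: SearchRepository) {
    fun findProjects(query: String): List<String> =
        repository.search(query, limit = 10).map { it.name }
}
```

Under these assumptions, moving to another engine would mean adding one more SearchRepository implementation and wiring it in, while code like SearchService stays unchanged.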

This is probably the biggest technical task (the rest of the tech debt is less scary).

How to update JVM version

Three places need to be updated:

  1. Build logic module toolchain version: build.gradle.kts
  2. Toolchain version in base jvm convention plugin: klibs.kotlin-jvm.gradle.kts
  3. Gradle daemon JVM version. Update the JVM version in the updateDaemonJvm task in build.gradle.kts, then run the task:
./gradlew updateDaemonJvm 

Gradle Build Scans

Gradle Build Scans can provide insights into a klibs.io backend build. JetBrains runs a Gradle Develocity server that can be used to automatically upload reports.

To opt in automatically, add the following to $GRADLE_USER_HOME/gradle.properties:

io.klibs.build.scan.enabled=true
# optionally provide a username that will be attached to each report
io.klibs.build.scan.username=John Wick

Also, you need to create an access key:

./gradlew provisionDevelocityAccessKey

A Build Scan may contain identifiable information. See the Terms of Use: https://gradle.com/legal/terms-of-use/.
