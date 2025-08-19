Building a Go Dependency Scanner From Scratch

Hackernoon
2025/08/19 17:36

When managing Go projects, you need to track dependencies, check for vulnerabilities, and ensure license compliance. Instead of relying on external tools, let's build our own dependency analyzer using Go's standard library.


The Core Structure

We'll work with Go modules, so we need structures to represent them:

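package main

import (
    "bufio"
    "encoding/json"
    "fmt"
    "net/http"
    "os"
    "regexp"
    "sort"
    "strings"
    "time"
)

// Module describes one module entry from go.mod.
type Module struct {
    Path     string
    Version  string
    Indirect bool // true when the line carries a "// indirect" comment
}

// GoMod holds the parsed contents of a go.mod file.
type GoMod struct {
    Module   Module
    Requires []Module
}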

Our tool will handle three operations: listing dependencies, vulnerability scanning, and license checking.

Parsing go.mod Files

Understanding Module File Structure

The go.mod file uses a specific format that we need to parse correctly. Module declarations start with the module keyword followed by the module path. Dependencies are listed in require statements, which can be single-line or grouped in multi-line blocks.

The parsing logic handles both formats by tracking whether we're inside a multi-line require block. We use regular expressions to extract the module path and version from each line, and detect indirect dependencies by looking for the // indirect comment. This gives us the requirements recorded in go.mod, which covers what we need here, without spawning external processes.

Rather than shelling out to go list, we can parse the go.mod file directly:

// parseGoMod reads go.mod from the current directory and extracts the
// module path and all require entries.
func parseGoMod() (*GoMod, error) {
    file, err := os.Open("go.mod")
    if err != nil {
        return nil, fmt.Errorf("go.mod not found: %w", err)
    }
    defer file.Close()

    goMod := &GoMod{
        Requires: []Module{},
    }

    scanner := bufio.NewScanner(file)
    inRequire := false // true while inside a multi-line require ( ... ) block
    requireRegex := regexp.MustCompile(`^\s*([^\s]+)\s+([^\s]+)(?:\s+//\s*indirect)?`)
    moduleRegex := regexp.MustCompile(`^module\s+(.+)`)

    for scanner.Scan() {
        line := strings.TrimSpace(scanner.Text())

        // Skip blank and comment-only lines so they are not mistaken for requirements.
        if line == "" || strings.HasPrefix(line, "//") {
            continue
        }

        if strings.HasPrefix(line, "module ") {
            if matches := moduleRegex.FindStringSubmatch(line); len(matches) > 1 {
                goMod.Module = Module{Path: matches[1]}
            }
            continue
        }

        if strings.HasPrefix(line, "require (") {
            inRequire = true
            continue
        }

        if inRequire && line == ")" {
            inRequire = false
            continue
        }

        if inRequire || strings.HasPrefix(line, "require ") {
            cleanLine := strings.TrimPrefix(line, "require ")
            if matches := requireRegex.FindStringSubmatch(cleanLine); len(matches) >= 3 {
                module := Module{
                    Path:    matches[1],
                    Version: matches[2],
                    // Match the comment marker, not just the word, so a module
                    // path containing "indirect" is not misclassified.
                    Indirect: strings.Contains(line, "// indirect"),
                }
                goMod.Requires = append(goMod.Requires, module)
            }
        }
    }

    return goMod, scanner.Err()
}

The parser handles both single-line requires and multi-line require blocks. It extracts module paths, versions, and identifies indirect dependencies.
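As a quick check of what the regular expression extracts, the sketch below applies a pattern adapted from parseGoMod to a few individual lines. The parseRequireLine helper and the module paths are hypothetical and exist only for this example.

```go
package main

import (
    "fmt"
    "regexp"
    "strings"
)

// requireRegex is adapted from the pattern used in parseGoMod.
var requireRegex = regexp.MustCompile(`^\s*([^\s]+)\s+([^\s]+)(?:\s+//\s*indirect)?`)

// parseRequireLine (hypothetical helper) returns the path, version and
// indirect flag for one line, with ok=false when the line is not a requirement.
func parseRequireLine(line string) (path, version string, indirect, ok bool) {
    clean := strings.TrimPrefix(strings.TrimSpace(line), "require ")
    m := requireRegex.FindStringSubmatch(clean)
    if len(m) < 3 {
        return "", "", false, false
    }
    return m[1], m[2], strings.Contains(clean, "// indirect"), true
}

func main() {
    lines := []string{
        "require github.com/example/single v1.2.3",
        "    github.com/example/second v1.0.0 // indirect",
        ")",
    }
    for _, l := range lines {
        path, version, indirect, ok := parseRequireLine(l)
        fmt.Printf("%q -> path=%q version=%q indirect=%v ok=%v\n", l, path, version, indirect, ok)
    }
}
```

The closing parenthesis does not match because the pattern needs two whitespace-separated tokens, so block delimiters never end up in the requirement list.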


Vulnerability Database Queries

How Vulnerability Checking Works

Vulnerability databases maintain records of known security issues in software packages. Each vulnerability is assigned identifiers such as CVE numbers and includes details about affected versions. The process works like this: we send the package name and version to the database API, which checks whether that specific version has any known vulnerabilities and returns a list of any issues it finds.

The OSV database is particularly useful because it aggregates vulnerability data from multiple sources and provides a unified API. When we query it, we're essentially asking "does this exact version of this package have any reported security problems?" The database performs version matching and returns structured data about any findings.
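To show what that question looks like on the wire, here is a small sketch that builds the JSON request body with typed structs instead of nested maps. The osvQuery and osvPackage types and the buildOSVQuery helper are names chosen for this example, and the module path is a placeholder.

```go
package main

import (
    "encoding/json"
    "fmt"
)

// osvPackage and osvQuery are illustrative types mirroring the fields
// this article sends to the OSV query endpoint.
type osvPackage struct {
    Name      string `json:"name"`
    Ecosystem string `json:"ecosystem"`
}

type osvQuery struct {
    Package osvPackage `json:"package"`
    Version string     `json:"version"`
}

// buildOSVQuery (hypothetical helper) serializes a query for one module version.
func buildOSVQuery(modulePath, version string) ([]byte, error) {
    return json.Marshal(osvQuery{
        Package: osvPackage{Name: modulePath, Ecosystem: "Go"},
        Version: version,
    })
}

func main() {
    body, err := buildOSVQuery("github.com/example/lib", "v1.2.3")
    if err != nil {
        fmt.Println("marshal failed:", err)
        return
    }
    // Prints: {"package":{"name":"github.com/example/lib","ecosystem":"Go"},"version":"v1.2.3"}
    fmt.Println(string(body))
}
```

Structs keep the field order fixed and catch typos in key names at compile time, which a map[string]interface{} cannot do.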

We can check the OSV (Open Source Vulnerabilities) database for known issues:

// checkOSVDatabase asks the OSV API whether the given module version has
// known vulnerabilities. It returns an error when the lookup itself fails.
func checkOSVDatabase(modulePath, version string) ([]string, error) {
    url := "https://api.osv.dev/v1/query"

    payload := map[string]interface{}{
        "package": map[string]string{
            "name":      modulePath,
            "ecosystem": "Go",
        },
        "version": version,
    }

    jsonData, err := json.Marshal(payload)
    if err != nil {
        return nil, err
    }

    client := &http.Client{Timeout: 10 * time.Second}
    resp, err := client.Post(url, "application/json", strings.NewReader(string(jsonData)))
    if err != nil {
        return nil, err
    }
    defer resp.Body.Close()

    if resp.StatusCode != http.StatusOK {
        return nil, fmt.Errorf("OSV returned status %d", resp.StatusCode)
    }

    var result struct {
        Vulns []struct {
            ID      string `json:"id"`
            Summary string `json:"summary"`
        } `json:"vulns"`
    }

    if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
        return nil, err
    }

    var vulnerabilities []string
    for _, vuln := range result.Vulns {
        vulnStr := fmt.Sprintf("%s: %s", vuln.ID, vuln.Summary)
        vulnerabilities = append(vulnerabilities, vulnStr)
    }

    return vulnerabilities, nil
}

// checkVulnerabilities scans every requirement in go.mod and prints a report.
func checkVulnerabilities() {
    goMod, err := parseGoMod()
    if err != nil {
        fmt.Printf("Error: %v\n", err)
        return
    }

    vulnerableModules, failedLookups := 0, 0

    for i, mod := range goMod.Requires {
        fmt.Printf("\rScanning %d/%d: %s", i+1, len(goMod.Requires), mod.Path)
        vulns, err := checkOSVDatabase(mod.Path, mod.Version)
        if err != nil {
            // Report failed lookups instead of silently treating them as clean.
            failedLookups++
            fmt.Printf("\n⚠️ Could not check %s@%s: %v\n", mod.Path, mod.Version, err)
            continue
        }
        if len(vulns) > 0 {
            vulnerableModules++
            fmt.Printf("\n🚨 %s@%s:\n", mod.Path, mod.Version)
            for _, vuln := range vulns {
                fmt.Printf("  - %s\n", vuln)
            }
        }
    }

    switch {
    case vulnerableModules > 0:
        fmt.Printf("\n⚠️ Found %d vulnerable modules\n", vulnerableModules)
    case failedLookups > 0:
        fmt.Printf("\n⚠️ No vulnerabilities found, but %d lookups failed\n", failedLookups)
    default:
        fmt.Println("\n✅ No known vulnerabilities found")
    }
}

The vulnerability checker sends a JSON payload with the module name and version, then parses the response for any reported vulnerabilities.
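The response side can be exercised without a network call. The sketch below decodes a hand-written body into the same two fields the checker reads; the summarize helper is hypothetical, and the ID and summary in the sample are placeholders rather than real advisories.

```go
package main

import (
    "encoding/json"
    "fmt"
    "strings"
)

// osvResponse keeps only the fields this article uses from an OSV reply.
type osvResponse struct {
    Vulns []struct {
        ID      string `json:"id"`
        Summary string `json:"summary"`
    } `json:"vulns"`
}

// summarize (hypothetical helper) turns a response body into "ID: summary" strings.
func summarize(body string) ([]string, error) {
    var resp osvResponse
    if err := json.NewDecoder(strings.NewReader(body)).Decode(&resp); err != nil {
        return nil, err
    }
    out := make([]string, 0, len(resp.Vulns))
    for _, v := range resp.Vulns {
        out = append(out, fmt.Sprintf("%s: %s", v.ID, v.Summary))
    }
    return out, nil
}

func main() {
    // Made-up body; the ID and summary are placeholders, not real advisories.
    sample := `{"vulns":[{"id":"EXAMPLE-0000-0001","summary":"placeholder summary"}]}`
    findings, err := summarize(sample)
    if err != nil {
        fmt.Println("decode failed:", err)
        return
    }
    fmt.Println(findings)

    // A body without a "vulns" field decodes to an empty list.
    none, _ := summarize(`{}`)
    fmt.Println(len(none))
}
```

Decoding into a narrow struct ignores every other field in the reply, so the sketch keeps working if the API adds more data.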


License Information Fetching

How License Detection Works

License compliance checking involves identifying what legal terms govern each dependency in your project. Most open source projects include license files in their repositories, and platforms like GitHub parse these files to identify the license type using SPDX identifiers.

Our approach leverages GitHub's license detection API, which analyzes repository contents and returns standardized license identifiers. For modules hosted on GitHub, we extract the owner and repository name from the module path, then query GitHub's API endpoint that specifically provides license information. This gives us machine-readable license data without having to download and parse license files ourselves.
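The owner and repository extraction can be sketched on its own. The githubOwnerRepo helper below is a hypothetical name, and the module paths are placeholders; it only splits the path and does not contact GitHub.

```go
package main

import (
    "fmt"
    "strings"
)

// githubOwnerRepo (hypothetical helper) extracts the owner and repository
// from a github.com module path. Trailing elements such as a "/v2" major
// version suffix or a subdirectory are ignored.
func githubOwnerRepo(modulePath string) (owner, repo string, ok bool) {
    parts := strings.Split(modulePath, "/")
    if len(parts) < 3 || parts[0] != "github.com" || parts[1] == "" || parts[2] == "" {
        return "", "", false
    }
    return parts[1], parts[2], true
}

func main() {
    for _, p := range []string{
        "github.com/example/lib",
        "github.com/example/lib/v2",
        "golang.org/x/text",
    } {
        owner, repo, ok := githubOwnerRepo(p)
        fmt.Printf("%s -> owner=%q repo=%q ok=%v\n", p, owner, repo, ok)
    }
}
```

Taking only the second and third path elements matters for modules with a major version suffix, since the repository name is still the third element.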

Different licenses have different requirements. Some, like MIT, are very permissive, while others, like GPL, have copyleft requirements that might affect how you can distribute your software. Understanding these differences is crucial for legal compliance.
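One way to act on that distinction is to bucket SPDX identifiers into rough categories. The sketch below is an assumption-laden illustration: the classifyLicense helper, the category names, and the short list of identifiers are choices made for this example, and anything not listed falls into a bucket for manual review.

```go
package main

import (
    "fmt"
    "strings"
)

// permissive is a small, illustrative subset of permissive SPDX identifiers.
var permissive = map[string]bool{
    "MIT":          true,
    "BSD-2-Clause": true,
    "BSD-3-Clause": true,
    "Apache-2.0":   true,
    "ISC":          true,
}

// classifyLicense (hypothetical helper) maps an SPDX identifier to a rough category.
func classifyLicense(spdxID string) string {
    switch {
    case permissive[spdxID]:
        return "permissive"
    case strings.HasPrefix(spdxID, "GPL-"), strings.HasPrefix(spdxID, "AGPL-"):
        return "strong copyleft"
    case strings.HasPrefix(spdxID, "LGPL-"), strings.HasPrefix(spdxID, "MPL-"):
        return "weak copyleft"
    default:
        return "needs review"
    }
}

func main() {
    for _, id := range []string{"MIT", "GPL-3.0", "MPL-2.0", "Unknown"} {
        fmt.Printf("%s: %s\n", id, classifyLicense(id))
    }
}
```

Prefix matching covers both the short identifiers and the "-only" and "-or-later" variants of the GPL family.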

For GitHub-hosted modules, we can get license data from their API:

// fetchGitHubLicense returns the SPDX identifier GitHub detected for a repository.
func fetchGitHubLicense(owner, repo string) string {
    url := fmt.Sprintf("https://api.github.com/repos/%s/%s/license", owner, repo)

    client := &http.Client{Timeout: 10 * time.Second}
    resp, err := client.Get(url)
    if err != nil {
        return "Unknown"
    }
    defer resp.Body.Close()

    // Any non-200 response (no license found, rate limiting) is reported as Unknown.
    if resp.StatusCode != http.StatusOK {
        return "Unknown"
    }

    var result struct {
        License struct {
            SPDXID string `json:"spdx_id"`
        } `json:"license"`
    }

    if err := json.NewDecoder(resp.Body).Decode(&result); err != nil {
        return "Unknown"
    }

    if result.License.SPDXID != "" && result.License.SPDXID != "NOASSERTION" {
        return result.License.SPDXID
    }

    return "Unknown"
}

// fetchLicenseFromRepo maps a module path to a license, using GitHub where possible.
func fetchLicenseFromRepo(modulePath string) string {
    // golang.org/x/ modules are assumed to be BSD-3-Clause rather than looked up.
    if strings.HasPrefix(modulePath, "golang.org/x/") {
        return "BSD-3-Clause"
    }

    if !strings.HasPrefix(modulePath, "github.com/") {
        return "Unknown"
    }

    parts := strings.Split(modulePath, "/")
    if len(parts) < 3 {
        return "Unknown"
    }

    // parts[0] is "github.com"; extra elements such as "/v2" are ignored.
    return fetchGitHubLicense(parts[1], parts[2])
}

// checkLicenses prints the license of every requirement and a summary.
func checkLicenses() {
    goMod, err := parseGoMod()
    if err != nil {
        fmt.Printf("Error: %v\n", err)
        return
    }

    licenseCount := make(map[string]int)

    for i, mod := range goMod.Requires {
        fmt.Printf("\rProcessing %d/%d...", i+1, len(goMod.Requires))
        license := fetchLicenseFromRepo(mod.Path)
        licenseCount[license]++
        fmt.Printf("\r  %s: %s\n", mod.Path, license)
    }

    // Sort license names so the summary prints in a stable order.
    licenses := make([]string, 0, len(licenseCount))
    for license := range licenseCount {
        licenses = append(licenses, license)
    }
    sort.Strings(licenses)

    fmt.Println("\nLicense Distribution:")
    for _, license := range licenses {
        count := licenseCount[license]
        percentage := float64(count) / float64(len(goMod.Requires)) * 100
        fmt.Printf("  %s: %d modules (%.1f%%)\n", license, count, percentage)
    }
}

The license checker assumes that golang.org/x/ packages use BSD-3-Clause, then queries GitHub's API for modules hosted on github.com. Everything else is reported as Unknown.

Dependency Analysis with Checksum Verification

The dependency analyzer lists modules and checks whether each one has a checksum entry in go.sum. It only confirms that an entry exists; it does not recompute or compare hashes:

// parseGoSum reads go.sum and returns a map from "module@version" to hash.
// Lines for go.mod-only hashes are stored under "module@version/go.mod".
func parseGoSum() map[string]string {
    checksums := make(map[string]string)

    file, err := os.Open("go.sum")
    if err != nil {
        return checksums
    }
    defer file.Close()

    scanner := bufio.NewScanner(file)
    for scanner.Scan() {
        parts := strings.Fields(scanner.Text())
        if len(parts) >= 3 {
            module := parts[0] + "@" + parts[1]
            checksums[module] = parts[2]
        }
    }

    return checksums
}

// analyzeDependencies prints every requirement with its checksum status.
func analyzeDependencies() {
    goMod, err := parseGoMod()
    if err != nil {
        fmt.Printf("Error: %v\n", err)
        return
    }

    checksums := parseGoSum()

    fmt.Printf("Module: %s\n", goMod.Module.Path)
    fmt.Printf("Found %d dependencies:\n\n", len(goMod.Requires))

    direct, indirect := 0, 0

    sort.Slice(goMod.Requires, func(i, j int) bool {
        return goMod.Requires[i].Path < goMod.Requires[j].Path
    })

    for _, mod := range goMod.Requires {
        status := "direct"
        if mod.Indirect {
            status = "indirect"
            indirect++
        } else {
            direct++
        }

        // Only the presence of an entry is checked, not the hash value itself.
        checksumKey := mod.Path + "@" + mod.Version
        hasChecksum := "❌"
        if _, exists := checksums[checksumKey]; exists {
            hasChecksum = "✅"
        }

        fmt.Printf("  %s %s@%s (%s)\n", hasChecksum, mod.Path, mod.Version, status)
    }

    fmt.Printf("\nSummary: %d direct, %d indirect dependencies\n", direct, indirect)
}
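Each go.sum line has three whitespace-separated fields: module path, version, and hash. A module version usually appears twice, once for the module's file tree and once with a /go.mod suffix on the version for the go.mod file alone. The sketch below splits two sample lines into their parts; the parseSumLine helper, the sumEntry type, and the hash values are placeholders invented for this example.

```go
package main

import (
    "fmt"
    "strings"
)

// sampleGoSum uses placeholder hashes; real entries carry base64 digests.
const sampleGoSum = `github.com/example/lib v1.2.3 h1:PLACEHOLDERTREEHASH=
github.com/example/lib v1.2.3/go.mod h1:PLACEHOLDERGOMODHASH=
`

// sumEntry is an illustrative view of one go.sum line.
type sumEntry struct {
    Module    string
    Version   string
    Hash      string
    GoModOnly bool // true when the hash covers only the go.mod file
}

// parseSumLine (hypothetical helper) splits one go.sum line into its fields.
func parseSumLine(line string) (sumEntry, bool) {
    f := strings.Fields(line)
    if len(f) != 3 {
        return sumEntry{}, false
    }
    version := strings.TrimSuffix(f[1], "/go.mod")
    return sumEntry{
        Module:    f[0],
        Version:   version,
        Hash:      f[2],
        GoModOnly: version != f[1],
    }, true
}

func main() {
    for _, line := range strings.Split(sampleGoSum, "\n") {
        if e, ok := parseSumLine(line); ok {
            fmt.Printf("%s@%s goModOnly=%v hash=%s\n", e.Module, e.Version, e.GoModOnly, e.Hash)
        }
    }
}
```

Because the analyzer builds its lookup key as path@version, it matches the first kind of line; a module that only has a /go.mod entry would be shown without a checksum.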


Command Interface

The main function routes commands to the appropriate handlers:

func main() {
    if len(os.Args) < 2 {
        fmt.Println("Usage: go run main.go <command>")
        fmt.Println("Commands:")
        fmt.Println("  deps      List all dependencies")
        fmt.Println("  vulns     Check for vulnerabilities")
        fmt.Println("  licenses  Check license compliance")
        os.Exit(1)
    }

    // Dispatch on the first command-line argument.
    switch os.Args[1] {
    case "deps":
        analyzeDependencies()
    case "vulns":
        checkVulnerabilities()
    case "licenses":
        checkLicenses()
    default:
        fmt.Println("Unknown command")
        os.Exit(1)
    }
}


Running the Tool

Save the code as main.go and run it in any Go project:

# List dependencies with checksum verification
go run main.go deps

# Scan for vulnerabilities
go run main.go vulns

# Analyze licenses
go run main.go licenses

The output shows dependency information, vulnerability reports, and license distribution across your project's dependencies. The tool demonstrates how dependency analysis works behind the scenes: parsing module files, querying public APIs, and cross-referencing data sources.

This implementation covers the basic concepts but is just a starting point. Real vulnerability scanning requires comprehensive databases, sophisticated version range matching, false positive filtering, and robust error handling. License compliance tools need legal policy engines, compatibility matrices, and custom license detection beyond what GitHub's API provides. For production use, you'd want multiple data sources, caching, rate limiting, and much more thorough validation logic.
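As one small step in that direction, caching can be layered over any lookup without changing it. The sketch below is a minimal in-memory version under stated assumptions: the lookupFunc type and cachedLookup wrapper are names invented here, and the slow function merely stands in for an HTTP request.

```go
package main

import (
    "fmt"
    "sync"
)

// lookupFunc is any function that resolves a key (for example a module path)
// to a result string. The type name is an assumption for this sketch.
type lookupFunc func(key string) string

// cachedLookup (hypothetical helper) wraps fn so each distinct key is
// resolved once. The mutex is held during fn, which serializes lookups.
func cachedLookup(fn lookupFunc) lookupFunc {
    var mu sync.Mutex
    cache := make(map[string]string)
    return func(key string) string {
        mu.Lock()
        defer mu.Unlock()
        if v, ok := cache[key]; ok {
            return v
        }
        v := fn(key)
        cache[key] = v
        return v
    }
}

func main() {
    calls := 0
    slow := func(key string) string {
        calls++ // stands in for an HTTP request
        return "MIT"
    }

    lookup := cachedLookup(slow)
    lookup("github.com/example/lib")
    lookup("github.com/example/lib")
    fmt.Println("underlying calls:", calls) // prints: underlying calls: 1
}
```

Several modules often live in the same repository, so even this simple cache can cut the number of license requests; a real tool would also want expiry and persistence.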

Happy coding ;)

You can find the source code here: https://github.com/rezmoss/go-dependency-scanner
