• Tim@lemmy.snowgoons.ro
    English
    arrow-up
    19
    ·
    2 days ago

    Annnnnd that’s why I downloaded a snapshot of Wikipedia a few months ago and host it locally.

    Sad that it’s necessary, but with modern AI tooling, we have everything we need to destroy knowledge on an industrial scale.

      • Truscape@lemmy.blahaj.zone
        English
        arrow-up
        8
        arrow-down
        1
        ·
        2 days ago

        Wikipedia has guides for it; check the section on downloading Wikipedia. The most popular offline client atm is the Kiwix reader.

      • Tim@lemmy.snowgoons.ro
        English
        arrow-up
        1
        ·
        1 day ago

        Kiwix is the easiest way to do it; if you have Docker/Kubernetes, there’s a Docker image at ghcr.io/kiwix/kiwix-serve, and the K8s manifest to deploy is as simple as:

        apiVersion: v1
        kind: Service
        metadata:
          name: wikipedia-service
        spec:
          selector:
            app: kiwix-server
          ports:
          - port: 80
            targetPort: 8080
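          # Headless service: DNS resolves straight to the pod IP, so clients
          # connect on 8080; the 80 -> 8080 mapping above is not applied.
          clusterIP: None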
        ---
        apiVersion: apps/v1
        kind: Deployment
        metadata:
          name: wikipedia-server
          labels:
            app: kiwix-server
        spec:
          replicas: 1
          selector:
            matchLabels:
              app: kiwix-server
          template:
            metadata:
              name: wikipedia-server
              labels:
                app: kiwix-server
            spec:
              containers:
              - name: kiwix-server
                image: ghcr.io/kiwix/kiwix-serve:3.8.0
                imagePullPolicy: IfNotPresent
                command:
                - /usr/local/bin/kiwix-serve
                - --port=8080
                - --verbose
                - /data/wikipedia_en_all_maxi.zim
                ports:
                - containerPort: 8080
                  protocol: TCP
                volumeMounts:
                - name: data
                  mountPath: /data
                  readOnly: true
                resources:
                  limits:
                    memory: "128Mi"
                    cpu: "2000m"
              volumes:
              - name: data
                persistentVolumeClaim:
                  claimName: wikipedia-mirror
        

        Then you just need to download a copy of the mirror file wikipedia_en_all_maxi.zim and put it on the volume behind the wikipedia-mirror claim, so that it shows up as /data/wikipedia_en_all_maxi.zim inside the container: wget https://download.kiwix.org/zim/wikipedia_en_all_maxi.zim
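
        The Deployment above expects a PersistentVolumeClaim called wikipedia-mirror to already exist. If you don't have one yet, something like this should work as a rough sketch. The 150Gi size and the use of your cluster's default StorageClass are assumptions on my part, not part of the setup above; the full English ZIM is on the order of 100 GB, so check the actual file size before picking a number.

        ```yaml
        apiVersion: v1
        kind: PersistentVolumeClaim
        metadata:
          # Must match claimName in the Deployment's volumes section
          name: wikipedia-mirror
        spec:
          accessModes:
          - ReadWriteOnce
          resources:
            requests:
              # Assumption: headroom for one full English ZIM file; adjust to the real size
              storage: 150Gi
          # No storageClassName set, so the cluster default is used (assumption)
        ```

        The pod mounts the volume read-only, so the ZIM file has to be copied onto it by some other route first.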