I have a loop that will search through my content everywhere, and I also have another loop that searches a specific sub, but I cannot find a way to apply both filters in the same loop.

```python
for submission in reddit.subreddit("sub").hot(limit=10000):
```

```python
for comment in redditor.comments.hot(limit=10000):
```

The problem is the 1,000-result limit: if I try to refine the content with Python while inside the loop, those 1,000 results miss the target, and 99% of them are other people's comments.
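For reference, this is roughly what I mean by refining inside the loop (placeholder credentials, and "sub" is just an example name); it still only ever sees the newest ~1,000 items the API returns:

```python
import praw

# Rough sketch of the in-loop filter described above. It still only sees
# the newest ~1,000 items Reddit returns for the listing, which is the
# whole problem.
reddit = praw.Reddit(
    client_id="...",
    client_secret="...",
    username="...",
    password="...",
    user_agent="comment cleanup script",
)

# limit=None pages through everything the API will give (capped around 1,000)
for comment in reddit.user.me().comments.new(limit=None):
    if comment.subreddit.display_name.lower() == "sub":
        print(comment.id, comment.permalink)
```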

The result is that a lot of my comments won't be touched. I can still see a lot of them in search engines.

How did you manage to remove as many comments as possible? I know you can also sort by controversial, but I was wondering if there is some PRAW finesse I could use here.

  • ubergeek77A · 1 year ago

    You can request a data export from Reddit which will have a more complete list of all your comments and all their IDs. Then you can feed those IDs into PRAW to do whatever you want. You can use Excel to filter it by sub, too.

    Don’t have any more specific advice since I haven’t done this in a few years, but hopefully just having the data export will help you out (assuming Reddit is even processing those right now).
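    That said, the PRAW side would look roughly like this; a sketch only, assuming the export's comments.csv has id and subreddit columns (check your own file for the exact layout, and the credentials and sub name are placeholders):

    ```python
    import csv
    import praw

    # Sketch only: assumes the Reddit data export contains a comments.csv
    # with "id" and "subreddit" columns; adjust to whatever your export
    # actually uses.
    reddit = praw.Reddit(
        client_id="...",
        client_secret="...",
        username="...",
        password="...",
        user_agent="comment cleanup script",
    )

    with open("comments.csv", newline="", encoding="utf-8") as f:
        for row in csv.DictReader(f):
            if row["subreddit"].lower() != "linux":  # filter to one sub, like the Excel step
                continue
            comment = reddit.comment(id=row["id"])
            comment.edit(body=".")  # optional: overwrite before deleting
            comment.delete()
    ```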

  • abff08f4813c@kbin.social · 1 year ago

    It's a Reddit limitation. There's no index for your comments in a specific sub, just indexes on your comments generally and on the sub itself.

    The way around this used to be the Pushshift API, which held a copy of everything and had no limits. Since Reddit shut that down, you now need to do more work.

    I recently outlined how to do this here, https://kbin.social/m/RedditMigration/t/65260/PSA-Here-s-exactly-what-to-do-if-you-hit-the

    • PabloDiscobar@kbin.social (OP) · 1 year ago

      Thanks, I did it yesterday… and I crashed my hard drive. It's a 1.2 TB archive, and my SSD couldn't even handle the I/O of the empty entries alone.

      I was still able to count my posts in /r/linux: more than 1,400. It goes fast.
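      For anyone curious, the counting itself was the easy part; something along these lines works, assuming the dump is the usual zstd-compressed NDJSON with an author field on each line (the file name and username here are placeholders):

      ```python
      import io
      import json
      import zstandard  # pip install zstandard

      # Rough sketch: stream a per-subreddit dump (zstd-compressed NDJSON,
      # one JSON object per line) and count the lines whose author matches.
      count = 0
      with open("linux_comments.zst", "rb") as fh:
          reader = zstandard.ZstdDecompressor(max_window_size=2**31).stream_reader(fh)
          for line in io.TextIOWrapper(reader, encoding="utf-8"):
              try:
                  obj = json.loads(line)
              except json.JSONDecodeError:
                  continue
              if obj.get("author") == "my_username":
                  count += 1

      print(count)
      ```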

      • abff08f4813c@kbin.social · 1 year ago

        Ouch. Sorry about the crash. If you try it again, keep in mind you can download individual subs without downloading the whole torrent; that's what I did, and because of my specific subs I was able to keep it under 1 GB.