Commit Graph

101 Commits

Author SHA1 Message Date
Justin Moore f90841ce8d Create database first 2024-10-18 13:48:53 -05:00
Justin Moore 5f00fa41eb Add in `docker-compose` files 2024-10-18 13:39:58 -05:00
Aleksei Voronov 8ddca6303d Bluesky blew through max int number of messages
Well, end of an era I guess. Time to use a bigger (bigint) cursor.

Interestingly enough, some of the Rust parts were already using an i64,
but others weren't. Huh.
2024-10-18 20:04:27 +02:00
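
A minimal sketch of the overflow described in the commit above, assuming the cursor had been stored as a 32-bit value (a SQL `integer`, i.e. `i32` on the Rust side). The helper `to_int_cursor` is hypothetical and is not taken from the repository.

```rust
use std::convert::TryFrom;

// Hypothetical helper: squeeze a firehose sequence number into a 32-bit
// cursor. Returns None once the value no longer fits into an i32.
fn to_int_cursor(seq: i64) -> Option<i32> {
    i32::try_from(seq).ok()
}

fn main() {
    let last_fitting = i64::from(i32::MAX); // 2_147_483_647
    let first_overflowing = last_fitting + 1;

    // The last value that still fits converts fine...
    assert_eq!(to_int_cursor(last_fitting), Some(i32::MAX));
    // ...and the very next message number does not.
    assert_eq!(to_int_cursor(first_overflowing), None);

    // An i64 (SQL `bigint`) cursor has plenty of headroom left.
    println!("i32::MAX = {}, i64::MAX = {}", i32::MAX, i64::MAX);
}
```

Using `i64` everywhere, together with a `bigint` column, avoids the conversion entirely.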
Aleksei Voronov 7eba3654f8 Handle missing repo errors properly
Apparently the error message has changed, and now we get missing repo errors,
not missing record errors for deleted profiles.

Let's handle it properly
2024-08-18 21:56:01 +02:00
Aleksei Voronov b7d4e8f73f Use correct type for query params in `getFeedSkeleton` endpoint
Using `Parameters` causes it not to deserialize the `limit` parameter correctly,
leading to 400 errors if it is specified at all.

`ParametersData` is the one we need.
2024-08-18 21:53:06 +02:00
Aleksei Voronov 1871f39331 Cross-compile using newer GCC
Building with default GCC included in `cross` at the moment (9.4.0) fails
on compiling `aws-lc-sys` with the following message:

    Your compiler (cc) is not supported due to a memcmp related bug reported in https://gcc.gnu.org/bugzilla/show_bug.cgi?id=95189.We strongly recommend against using this compiler.

GCC 10.5 seems to include a bugfix for this, but 9.4 does not.

Building with GCC 10.5 works.
2024-08-18 21:35:03 +02:00
Aleksei Voronov 3f79bad38f Add some instructions about cross-compiling
The configs are included, so might as well mention what to do
2024-08-18 20:59:10 +02:00
Aleksei Voronov b8d1fd7695 Upgrade everything to latest versions
This includes a bunch of small changes to adapt to how atrium-api has changed
over time. They're not functional or interesting, just some type-level
adjustments that are needed.

Some more complicated logic was changed in how profile details are parsed,
since atrium's way of doing things is weird and hard to understand so I just
manually grab stuff from the object map instead of relying on atrium's types.
This is similar to how CBOR parsing is done.

Boring maintenance stuff.
2024-08-18 14:27:23 +02:00
Aleksei Voronov 149cd44227 Add an index on Profile.has_been_processed
Who made the oldest mistake in database management history?

This guy.
2024-01-15 15:17:13 +01:00
Aleksei Voronov 10d4556ff3 Timeout if we haven't received any messages in 60 seconds
Sometimes, it seems, Bluesky just stops sending us messages. I do not know why.
Let's just try to timeout if that ever happens again?
2024-01-15 15:01:22 +01:00
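
A rough sketch of the "give up after a quiet period" idea from the commit above. The project itself is async, so this is not its implementation: the sketch swaps in a plain `std::sync::mpsc` channel and `recv_timeout` to stay dependency-free. The `Wait` enum and `wait_for_message` are hypothetical names.

```rust
use std::sync::mpsc::{self, RecvTimeoutError};
use std::time::Duration;

// Hypothetical outcome of one wait on the message stream.
#[derive(Debug, PartialEq)]
enum Wait {
    Message(String),
    TimedOut,
    Closed,
}

// Wait for the next message, but no longer than `limit`.
fn wait_for_message(rx: &mpsc::Receiver<String>, limit: Duration) -> Wait {
    match rx.recv_timeout(limit) {
        Ok(msg) => Wait::Message(msg),
        Err(RecvTimeoutError::Timeout) => Wait::TimedOut,
        Err(RecvTimeoutError::Disconnected) => Wait::Closed,
    }
}

fn main() {
    let (tx, rx) = mpsc::channel();
    // The commit uses 60 seconds; a short limit keeps the sketch quick.
    let limit = Duration::from_millis(20);

    tx.send("commit".to_string()).unwrap();
    assert_eq!(wait_for_message(&rx, limit), Wait::Message("commit".to_string()));

    // The sender is still alive but silent: this is the case to detect.
    assert_eq!(wait_for_message(&rx, limit), Wait::TimedOut);

    // Dropping the sender closes the stream, which is a different outcome.
    drop(tx);
    assert_eq!(wait_for_message(&rx, limit), Wait::Closed);
}
```

On `TimedOut` the caller can tear down the connection and resubscribe instead of waiting forever.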
Aleksei Voronov 1555a803e9 Upgrade dependencies
No breakage this time, nice
2024-01-15 14:15:07 +01:00
Aleksei Voronov be38e1e5a3 Gracefully handle cases when ChatGPT doesn't respond with anything
It hasn't happened yet, but it's a ticking time bomb
2023-11-29 13:57:44 +01:00
Aleksei Voronov f6492fddc1 cargo clippy 2023-11-29 11:27:04 +01:00
Aleksei Voronov 51b5d6de71 Actually run all the components in parallel threads
I have much to learn about async Rust, but the gist of it is that running try_join
on a bunch of futures makes it run them in one task which is not at all what I want
because all the components here are separate from each other. Apparently to make it
spawn separate tasks, you need to use tokio::spawn. You live and learn.

Ideally of course these should be separate processes entirely, but for now I'm
far too lazy to deal with that
2023-11-29 11:24:55 +01:00
Aleksei Voronov 635e8506c6 Update cross-compile config to properly install openssl stuff
This is based on the recipe in Cross wiki[^1] and seems to work

[^1]: https://github.com/cross-rs/cross/wiki/Recipes#pre-build
2023-11-29 11:08:19 +01:00
Aleksei Voronov 3f979af5d8 Add a way to force profile country by DID
This is useful to fixup broken profiles.

Ideally, of course, we should be retrying a few times and then giving up,
but that seems kinda too much for a little hobby project.

Also, metrics would be nice, but, you know
2023-11-29 10:45:47 +01:00
Aleksei Voronov 85efe62fdf Mention which profile we could not classify in the error message 2023-11-29 10:39:48 +01:00
Aleksei Voronov 2bb88d69b3 Upgrade dependencies
- For new axum, use the new way to start the server. No other changes seem necessary.
- For new atrium, update the way agent is initialized. Also now we cannot get the
  session out of the agent, so resolve our own handle to the did with an extra request.
  This is a shame, but eh. That's what you get when using unstable libraries
2023-11-29 10:39:29 +01:00
Aleksei Voronov 77d2d90522 Acknowledge atrium-api in the README 2023-11-08 20:37:31 +01:00
Aleksei Voronov 8426bf7c8c Formatting 2023-11-06 08:59:28 +01:00
Aleksei Voronov c0c56627c1 Add some context to profile classifier errors
There's something broken with one of them atm and I'm not sure what, this will help
2023-11-06 08:58:59 +01:00
Aleksei Voronov 87dfb24e1a Update setup instructions
Point to the entire sql directory, not just the first file,
now that there are multiple things there
2023-11-06 08:55:19 +01:00
Aleksei Voronov 35ee1b0a1f Simplify Bluesky api usage
`atrium-api` now includes an `AtpAgent` which takes care of creating
and refreshing sessions automatically, so we no longer need our
custom xrpc client and session management logic.

This is nice.
2023-11-06 08:53:23 +01:00
Aleksei Voronov 524598a40b Upgrade to latest atrium-api
Some breakage there, but nothing major.

They also have AtpAgent now so maybe we can get rid of our custom session-refreshing thing?
2023-11-05 20:51:16 +01:00
Aleksei Voronov b0f9b9618c Update Firehose URL to bsky.network
This is the right thing to do at this moment as per Bluesky blog post[1]
asking developers to switch to the new URL which supports federation
properly.

[1]: https://atproto.com/blog/bgs-and-did-doc
2023-11-05 20:16:29 +01:00
Aleksei Voronov 419f72f3bb Store Bluesky firehose host as part of subscription state
The host has changed recently and the cursors are, apparently, incompatible with each
other, so we need to migrate to the new one, and this seems like the easiest way to
do it:

1. Store the host as part of the subscription state
2. Roll out the version that uses that new column as part of all queries
3. Switch to the new host
4. Roll out the version with the new host
   - The new instance will start processing messages with a 0 cursor and so start anew
   - The old instance will die off
2023-11-05 19:01:20 +01:00
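
The migration steps in the commit above can be sketched roughly as follows. This is an in-memory stand-in for the subscription state table, not the project's actual schema or queries. `SubscriptionState`, its methods, and the host names are all hypothetical.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for the subscription state table, keyed by host.
#[derive(Default)]
struct SubscriptionState {
    cursors: HashMap<String, i64>,
}

impl SubscriptionState {
    // A host without a stored row starts from cursor 0, i.e. starts anew.
    fn cursor_for(&self, host: &str) -> i64 {
        self.cursors.get(host).copied().unwrap_or(0)
    }

    // Every write is scoped to the host it came from.
    fn save(&mut self, host: &str, cursor: i64) {
        self.cursors.insert(host.to_string(), cursor);
    }
}

fn main() {
    let mut state = SubscriptionState::default();

    // Steps 1-2: the old instance keeps its cursor under the old host.
    state.save("old-host.example", 123_456);
    assert_eq!(state.cursor_for("old-host.example"), 123_456);

    // Steps 3-4: the new host has no row yet, so it begins at 0.
    assert_eq!(state.cursor_for("new-host.example"), 0);

    // Progress on the new host never touches the old host's cursor.
    state.save("new-host.example", 42);
    assert_eq!(state.cursor_for("old-host.example"), 123_456);
    assert_eq!(state.cursor_for("new-host.example"), 42);
}
```

Keying by host means the incompatible cursors never mix, so no explicit reset is needed during the switch.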
Aleksei Voronov 3b03b11d58 Remove auth hack
atrium_api 0.9.1 forwards auth stuff properly to clients, so we don't need this anymore
2023-10-30 21:38:15 +01:00
Aleksei Voronov c7bceefc07 Upgrade dependencies 2023-10-30 19:06:59 +01:00
Aleksei Voronov 556f939774 Attempt to also index all posts made by posters that previously posted in Russian
I don't know if this will work well though, performance-wise.
It's basically going to now do a query per post, which may or may
not be a great idea
2023-10-16 12:10:12 +02:00
Aleksei Voronov 8ad19f6fa5 Formatting 2023-10-16 12:06:41 +02:00
Aleksei Voronov f0ca7e58e8 Move a comment to its proper place 2023-10-15 20:10:52 +02:00
Aleksei Voronov b5156ecfbf Remove unnecessary parameterization of Bluesky hosts
Really isn't necessary. I'm never going to use anything other than proper, real, production Bluesky for this.
2023-10-15 20:09:23 +02:00
Aleksei Voronov fddcf7272c Move profile details model into entities too 2023-10-15 19:59:05 +02:00
Aleksei Voronov e1baeffc6e Deserialize reply information
This may come in useful if I want to, you know, remove replies from the feed
2023-10-15 17:27:43 +02:00
Aleksei Voronov 892d600754 Log the entire post instance for debugging 2023-10-15 17:27:18 +02:00
Aleksei Voronov fad2283aa2 Lol, actually use the transaction that we make 2023-10-15 17:19:15 +02:00
Aleksei Voronov 96915ca986 Pass around bluesky objects instead of restructuring them everywhere
Makes things a little simpler in many places, especially once I start adding more fields to posts
2023-10-15 11:55:25 +02:00
Aleksei Voronov 5eeb0e45b1 Restructure Bluesky-related code a bit
- Put internal stuff (cbor, ipld deserialization, xrpc client) into internals module
- Move various record types into separate modules under entities
- Also move session into entities as well
- Simplify CBOR conversion stuff by liberal usage of TryFrom

This will all make it a little easier to implement additional things, like filtering out replies
2023-10-15 11:47:56 +02:00
Aleksei Voronov f008057f8a Upgrade dependencies because why not 2023-10-15 11:47:56 +02:00
Aleksei Voronov c498f9edd3 Create rust.yml 2023-10-09 16:13:25 +02:00
Aleksei Voronov 25373561b4 Update README.md
Add some details about dependencies used
2023-10-09 09:31:26 +02:00
Aleksei Voronov 6608071ef5 Update README.md
Add more details about setup, remove the unnecessary roadmap, add links to production deployment.
2023-10-09 09:17:36 +02:00
Aleksei Voronov f5e1d3e020 Formatting 2023-10-07 18:54:43 +02:00
Aleksei Voronov 768bb9f175 The feed parameter is an at-uri, not the feed name 2023-10-07 18:54:27 +02:00
Aleksei Voronov 6f8c86d815 Switch cursor-related messages to debug level
Otherwise it's too noisy
2023-10-07 18:27:25 +02:00
Aleksei Voronov 1bd843a05a Fix publishing feeds
This basically required implementing authentication from ground up
because atrium-api is horribly deficient when it comes to it,
providing basically no real way to manage it, and what is provided
is actually broken anyway, requiring additional hacks to get around.

Ah well. This has been the story of using anything in Rust that's
related to Bluesky. Everything is broken.
2023-10-07 18:26:20 +02:00
Aleksei Voronov 1e0e34b9a5 Fix the name of the service endpoint field in did.json 2023-10-07 18:24:04 +02:00
Aleksei Voronov 1d17c8b637 Make it able to run in production
- Remove the build Dockerfile, it's not useful on cheap VMs because you can't really build anything on them
- Update the serving address to be 0.0.0.0 so that it's actually exposed externally (127.0.0.1 isn't)
- Also update the port to be 3030 for no reason at all
- Add a Cross.toml config file for cross-compilation, since my machine doesn't run the Linux that the resulting binary needs to run on
2023-10-06 20:24:58 +02:00
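
A small illustration of the serving address change in the commit above, using only the standard library. The function `serving_addr` is hypothetical. The port 3030 and the two addresses come from the commit message.

```rust
use std::net::{Ipv4Addr, SocketAddr};

// Hypothetical helper: pick the address the server binds to.
// 127.0.0.1 only accepts connections from the same machine,
// while 0.0.0.0 listens on every network interface.
fn serving_addr(expose_externally: bool) -> SocketAddr {
    let ip = if expose_externally {
        Ipv4Addr::UNSPECIFIED // 0.0.0.0
    } else {
        Ipv4Addr::LOCALHOST // 127.0.0.1
    };
    SocketAddr::from((ip, 3030))
}

fn main() {
    let local_only = serving_addr(false);
    let production = serving_addr(true);

    assert!(local_only.ip().is_loopback());
    assert!(production.ip().is_unspecified());

    println!("local: {local_only}, production: {production}");
}
```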
Aleksei Voronov dec35a867c Fix SQL syntax
Good God, Aleksei.
2023-10-05 20:38:22 +02:00
Aleksei Voronov 2062f0bb89 Add a nice optimized dockerfile for deployment 2023-10-05 20:20:06 +02:00