Aleksei Voronov
556f939774
Attempt to also index all posts made by posters that previously posted in Russian
...
I don't know if this will work well though, performance-wise.
It's basically going to now do a query per post, which may or may
not be a great idea
2023-10-16 12:10:12 +02:00
Aleksei Voronov
8ad19f6fa5
Formatting
2023-10-16 12:06:41 +02:00
Aleksei Voronov
f0ca7e58e8
Move a comment to its proper place
2023-10-15 20:10:52 +02:00
Aleksei Voronov
b5156ecfbf
Remove unnecessary parameterization of Bluesky hosts
...
Really isn't necessary. I'm never going to use anything other than proper, real, production Bluesky for this.
2023-10-15 20:09:23 +02:00
Aleksei Voronov
fddcf7272c
Move profile details model into entities too
2023-10-15 19:59:05 +02:00
Aleksei Voronov
e1baeffc6e
Deserialize reply information
...
This may come in useful if I want to, you know, remove replies from the feed
2023-10-15 17:27:43 +02:00
Aleksei Voronov
892d600754
Log the entire post instance for debugging
2023-10-15 17:27:18 +02:00
Aleksei Voronov
fad2283aa2
Lol, actually use the transaction that we make
2023-10-15 17:19:15 +02:00
Aleksei Voronov
96915ca986
Pass around Bluesky objects instead of restructuring them everywhere
...
Makes things a little simpler in many places, especially once I start adding more fields to posts
2023-10-15 11:55:25 +02:00
Aleksei Voronov
5eeb0e45b1
Restructure Bluesky-related code a bit
...
- Put internal stuff (CBOR, IPLD deserialization, XRPC client) into internals module
- Move various record types into separate modules under entities
- Move session into entities as well
- Simplify CBOR conversion stuff by liberal usage of TryFrom
This will all make it a little easier to implement additional things, like filtering out replies
2023-10-15 11:47:56 +02:00
Aleksei Voronov
f008057f8a
Upgrade dependencies because why not
2023-10-15 11:47:56 +02:00
Aleksei Voronov
c498f9edd3
Create rust.yml
2023-10-09 16:13:25 +02:00
Aleksei Voronov
25373561b4
Update README.md
...
Add some details about dependencies used
2023-10-09 09:31:26 +02:00
Aleksei Voronov
6608071ef5
Update README.md
...
Add more details about setup, remove the unnecessary roadmap, add links to production deployment.
2023-10-09 09:17:36 +02:00
Aleksei Voronov
f5e1d3e020
Formatting
2023-10-07 18:54:43 +02:00
Aleksei Voronov
768bb9f175
The feed parameter is an at-uri, not the feed name
2023-10-07 18:54:27 +02:00
Aleksei Voronov
6f8c86d815
Switch cursor-related messages to debug level
...
Otherwise it's too noisy
2023-10-07 18:27:25 +02:00
Aleksei Voronov
1bd843a05a
Fix publishing feeds
...
This basically required implementing authentication from the ground up
because atrium-api is horribly deficient when it comes to it,
providing basically no real way to manage it, and what is provided
is actually broken anyway, requiring additional hacks to get around.
Ah well. This has been the story of using anything in Rust that's
related to Bluesky. Everything is broken.
2023-10-07 18:26:20 +02:00
Aleksei Voronov
1e0e34b9a5
Fix the name of the service endpoint field in did.json
2023-10-07 18:24:04 +02:00
Aleksei Voronov
1d17c8b637
Make it able to run in production
...
- Remove the build Dockerfile, it's not useful on cheap VMs because you can't really build anything on them
- Update the serving address to be 0.0.0.0 so that it's actually exposed externally (127.0.0.1 isn't)
- Also update the port to be 3030 for no reason at all
- Add a Cross.toml config file for cross-compilation, since my machine doesn't run the Linux that the resulting binary needs to run on
2023-10-06 20:24:58 +02:00
Aleksei Voronov
dec35a867c
Fix SQL syntax
...
Good God, Aleksei.
2023-10-05 20:38:22 +02:00
Aleksei Voronov
2062f0bb89
Add a nice optimized dockerfile for deployment
2023-10-05 20:20:06 +02:00
Aleksei Voronov
70f9733112
Upgrade dependencies because why not
2023-10-02 17:26:21 +02:00
Aleksei Voronov
2df16725bc
Update README with some up-to-date info
2023-10-02 17:23:22 +02:00
Aleksei Voronov
2fca5497e6
Allow forcing country for multiple profiles
...
In case it's necessary
2023-10-02 17:10:14 +02:00
Aleksei Voronov
883d02e328
Extract and print out the time of the commit
...
Useful for visibility for when we inevitably fall behind in processing
2023-10-02 16:59:31 +02:00
Aleksei Voronov
6bc2dc2a42
Simplify states a little in the server
...
Use the sub-states directly, as suggested by axum's docs
2023-10-02 16:44:03 +02:00
Aleksei Voronov
4a08a283d2
Properly handle errors in post indexer and profile classifier
...
Reconnect to Bluesky in the indexer
Don't exit the classifier just because we couldn't fetch profiles
2023-10-02 16:34:22 +02:00
Aleksei Voronov
1ac405e5ee
Add a way to manually mark a certain profile as being from a specific country
2023-09-27 13:22:26 +02:00
Aleksei Voronov
db8a85624f
Formatting
2023-09-27 12:45:08 +02:00
Aleksei Voronov
96480b6fb9
Add proper error handling to the web service
...
Return 500 when shit happens.
Return 404 when the feed is not found
2023-09-25 12:49:01 +02:00
Aleksei Voronov
fadf882a1f
Formatting and clippy
2023-09-24 20:27:51 +02:00
Aleksei Voronov
0cd3202a9c
Don't error out on profiles that don't exist anymore
...
Nice little match there, sigh.
Closes #1
2023-09-24 20:26:34 +02:00
Aleksei Voronov
642a3d57cc
Remove ciborium in favor of custom deserialization logic
...
Unfortunately, looks like serde is not flexible enough to support everything CBOR does,
so a lot of messages cannot be deserialized properly. Other serde-based CBOR libraries
suffer from the same problem.
So now we have a bunch of boring deserialization logic supported by sk-cbor
2023-09-24 20:06:20 +02:00
Aleksei Voronov
ffccdc40fe
Update the roadmap a little bit to mention everything that's needed to get to v1
2023-09-23 20:42:35 +02:00
Aleksei Voronov
2268f9ca14
Limit language detector to only use Cyrillic script
...
This brings memory consumption down to around 100 MB, which is much more reasonable than the 1 GB
it was using previously
2023-09-23 20:39:51 +02:00
Aleksei Voronov
658996d5d5
Delete posts from the database when they are deleted from Bluesky
2023-09-23 20:29:56 +02:00
Aleksei Voronov
dd33333649
Rewrite streaming processing in a more sane way
...
And also add support for likes and follows
2023-09-23 20:25:26 +02:00
Aleksei Voronov
3a54e04bf4
Upgrade atrium-api dependency
2023-09-22 18:21:08 +02:00
Aleksei Voronov
aa17ece012
Fix clippy lints
...
Nothing major here tbh
2023-09-22 17:15:48 +02:00
Aleksei Voronov
83bede52ce
Remove dead code
2023-09-22 17:12:49 +02:00
Aleksei Voronov
e95c4923d6
Add some untested version of publishing a feed
...
Also adjust names of different env vars, and also adjust setup instructions
2023-09-22 13:33:13 +02:00
Aleksei Voronov
5128bf9d4a
Refactor streaming stuff
...
Now we call the processor once per commit, and it's also now
a commit processor, not an operation processor, so that we can
update the cursor properly
2023-09-22 12:37:10 +02:00
Aleksei Voronov
08dc55b2cd
Rejiggle things a bit to make it possible to have multiple binaries here for publishing
2023-09-21 15:01:43 +02:00
Aleksei Voronov
901c4b6e97
Make Algo.should_index_post a fallible async function, for maximum extensibility
...
We may want to perform some more complicated operations here in the future
2023-09-21 13:31:27 +02:00
Aleksei Voronov
c02bded6f8
Formatting
2023-09-21 13:28:22 +02:00
Aleksei Voronov
2fd1474647
Don't crash when unable to classify a profile due to some random problem
...
Random problems include: deleted profiles.
Also always wait 10 seconds between runs, we don't need to do it so often
2023-09-21 13:25:36 +02:00
Aleksei Voronov
93c4979c71
Keep subscription state in order to not lose messages
...
This isn't a good way to do it though, because the operations processor is only called for each operation,
so we end up not updating the cursor as often as we realistically should be.
I'll refactor this slightly later
2023-09-21 12:33:17 +02:00
Aleksei Voronov
62b00ceed7
Replace random print statements with proper logging setup
2023-09-21 11:22:18 +02:00
Aleksei Voronov
f4ee482ce7
Use Arcs to pass stuff around to avoid dealing with lifetimes
...
And also implement proper language detection through lingua-rs,
because Bluesky's detection is really bad
2023-09-21 10:36:47 +02:00