The Internet mostly works in Spain when there is a match: you can check the traffic figures from the major exchange points, and they are unaffected.
Big businesses are unaffected, since LaLiga will quickly reverse any block that impacts popular websites and risks triggering significant public outcry.
Most people in Spain don’t care — and many aren’t even aware of the overly broad blocks.
Cloudflare and RootedCON are challenging this in court, but it may take many years before a final outcome is reached.
There are many things that are expensive but are nevertheless not particularly seen as “status symbols”, in the sense of being commonly used to publicly display one’s status/wealth/whatever.
I replaced my old iPhone XR with a brand new 16 this year, not because anything was wrong with it (even the battery was OK), but because I wanted to see what the changes brought.
I was quite surprised that, other than the much better battery, USB-C, a much better camera, and occasionally faster speeds, the old one was holding up quite well.
You can get an old iPhone XR for 100 EURish in decent condition. I really have no idea what model year iPhones others have.
Hardly any Americans earn the minimum wage. You're talking about less than 1/2 of 1% of the working population. It's a nearly worthless metric (other than as a political reference to how long it has been since the minimum wage was increased and how far behind the median it is).
It’s just another of the many, many comments in this thread where people throw out statistics to make a point, but those statistics are typically detached from reality or not even focused on the main topic of the conversation.
We and many others use the same techniques too. This is a concept introduced 20 years ago, and RIPE Atlas/IPmap has been a public implementation of this idea for the last decade or so.
(However, the vast majority of IPs can't be geolocated in this way, and there are caveats to those that can be.)
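To make the latency idea concrete, here's a minimal sketch of the speed-of-light sanity check behind probe-based geolocation: an RTT from a probe at a known location puts an upper bound on how far away the target can be, since light in fiber covers roughly 200 km per millisecond. The probe coordinates and RTT below are invented for illustration, not real measurements.

```python
# Minimal sketch of the speed-of-light check behind probe-based geolocation.
# Probe coordinates and the RTT below are invented for illustration.
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371
FIBER_KM_PER_MS = 200  # light in fiber covers roughly 200 km per millisecond

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def max_distance_km(rtt_ms):
    """Upper bound on the target's distance from the probe, given the RTT."""
    return (rtt_ms / 2) * FIBER_KM_PER_MS

def claim_is_plausible(probe_lat, probe_lon, rtt_ms, claimed_lat, claimed_lon):
    """A claimed location is ruled out if it lies beyond the RTT distance bound."""
    return haversine_km(probe_lat, probe_lon, claimed_lat, claimed_lon) <= max_distance_km(rtt_ms)

# A 2 ms RTT from a probe in Toronto caps the target at ~200 km, which is
# enough to rule out a claimed location in Calgary (~2,700 km away).
print(claim_is_plausible(43.65, -79.38, 2.0, 51.05, -114.07))  # False
```

One measurement like this can only rule locations out; pinning a location down takes many probes in many places, which is why the size and spread of the probe network matters so much.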
In any case, the difficulty with all providers in this space is how you prove accuracy at scale. If we assume some provider has some proprietary technique that nets 100% accuracy, that's great, but what do you compare it to? There is no ground truth data source - we are supposed to be that.
Marketing plays a big role, and admittedly, these guys have much better marketing on this point :)
I have no doubt that other providers use latency from probes as a data point. But IPinfo gives it much more weight in their calculations, probably because they have developed their own reliable network, unlike most competitors.
Your service, like many others, accepts as valid most of the intentionally fake geolocation data provided by networks. I am sure you already know this, so there's no need to mislead by saying "we do the same".
This IP is actually in Ontario, which you can easily verify with a ping measurement. But it is announced as being in Calgary by Apple's iCloud Relay geofeed (https://mask-api.icloud.com/egress-ip-ranges.csv).
Why? Because the intent of iCloud Relay is to obscure a user's IP address while still providing a roughly accurate location, specifically so that geolocation-based services still work as expected. For that to work, they need to provide 'fake data' in this geofeed so that they have pools of addresses covering thousands of cities around the world, AND they need geolocation providers to accept this.
ipinfo accepts this, even though it's wrong. So do we. After all, geofeeds were supposed to provide a public geolocation database, the idea being that the network operator should be trusted to have the best information. We could provide the 'real' location, but if 9 providers say an IP address is in X and 1 provider says it's in Y, and Y is correct, you may just be frustrating end users of the network.
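For anyone curious what a geofeed actually asserts, here's a rough sketch of looking up the claimed location for an address in an RFC 8805-style CSV like the iCloud Relay one linked above. The example IP is just a placeholder, not the address discussed here.

```python
# Rough sketch: look up what a published geofeed (RFC 8805 CSV) claims for an
# address. The example IP below is a placeholder, not the one discussed above.
import csv
import ipaddress
import urllib.request

GEOFEED_URL = "https://mask-api.icloud.com/egress-ip-ranges.csv"

def geofeed_claim(ip_str, url=GEOFEED_URL):
    """Return the (prefix, country, region, city) entry covering ip_str, if any."""
    ip = ipaddress.ip_address(ip_str)
    best = None
    with urllib.request.urlopen(url) as resp:
        lines = (line.decode("utf-8") for line in resp)  # stream, the file is big
        for row in csv.reader(lines):
            if not row or row[0].startswith("#"):
                continue
            try:
                net = ipaddress.ip_network(row[0], strict=False)
            except ValueError:
                continue
            if ip.version == net.version and ip in net:
                # Keep the most specific (longest-prefix) match.
                if best is None or net.prefixlen > best[0].prefixlen:
                    best = (net, *row[1:4])
    return best

print(geofeed_claim("192.0.2.1"))  # placeholder address; prints None or a row
```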
But where is the line? I'm not sure, and it's hard to say who has the balance right here.
We try to mitigate this by providing extra data like whether the address is a hosting or relay provider - for free, unlike others :) Some future addition could be to provide additional accuracy or source information, or even a 'reported' vs 'measured' location.
We're working through this and hope to get to the right answer over time. Thanks for raising this. :)
Thank you very much, Nyr. We really appreciate the trust you have placed in us.
We are constantly expanding our probe network, and my colleagues are also working on stabilizing and improving the network.
From 2025 onward, we will be putting a lot of effort into our R&D program. Our probe network provides the data that helps our data and research team create better models.
Currently, a good portion of the data is available to everyone with no compromise. When we do make mistakes, we hope our community of users will point them out to us. Each ASN or IP address mistake generates a new ticket; our CEO/founder is tagged on it, and our data team investigates, pushes fixes, and provides explanations.
Nice to meet you. I love the work you have done so far for IPlocate.io. Keep up the good work. The point you have raised is interesting. However, I am not sure about the "Marketing plays a big role" point you made.
We are a developer-first company.
- Developers need a good product, so we are always committed to providing one, whether it is free or not. The best-in-class product we offer is the result of our obsession with meeting the expectations of our developer community. Our relationship with our developers relies largely (but not entirely) on the quality of our product. And that is not just data (which is, again, miles ahead of everyone else); it is integrations, infrastructure, site reliability, uptime, dashboards, tools, the CLI, and much more.
- If developers cannot get the best product from us, we have to tell them why we are not the best. So we write long community posts and have deep technical discussions. We built the community forum just so our developers can have conversations directly with us.
- Developers need someone to be present and responsive, and we are always there. In our community, someone from our technical team is available seven days a week.
- Developers enjoy a product outside of work too, so we put together huntathons, online games, technical articles, and events for them.
Being developer-first comes with revenue perks, sure. Our founder started the company 12 years ago, and it has been slow and stable growth. We have built trust in the community, and today it is impossible for us to find any developers who haven't heard of us. That is not because we have a billboard or spent X amount on paid marketing. They heard of us because they used our product, they like our product, and they know our team.
IPInfo appears to have good data, but their main advantage is strong marketing. They've effectively built a community of fanboys by publicly sharing techniques and methods that other major providers have been using for years or even decades.
I would not say we have a super strong marketing effort, but we genuinely care about developers. That is how the company started, and that is how the company is operating after 12 years.
We are super obsessed with any point of friction an IPinfo user has, but to be honest, that is just what any developer-first business should do. We are obsessed with developer sentiment and perception, but that is what the industry should be doing. Our users are some of the smartest people I know, and because we go above and beyond to be helpful to them, they do not just point out the mistakes we make; they actually try to help us.
Consider how we find servers for our probe network. We genuinely hit a wall after 750 servers. Then we reached out to our community of users, who found 150 more. There are even developers who will talk to their local hosting providers in their local language just to get us a server.
The same goes for writing code to integrate our data into different places. Our team is extremely small, so we cannot actively contribute engineering effort to open source projects. Our users often help us a lot by writing high-quality code for open source projects they already use.
We are humbled by the help our users share, and that is why we are obsessed with them and try our best to go above and beyond to help them.
Governments do not even need any of the providers to comply; they can access global NetFlow data. This is conveniently not discussed by any commercial VPN provider.
It ultimately depends on your threat model. But assuming a state actor has access to NetFlow data, an attack could work like this:
* State actor determines that an IP belonging to a VPN company had a session on example.com around t1-t2
* You -> VPN server at t1
* VPN server -> example.com at t1+latency
* More traces from both sides until around t2 as you browse the site
By correlating multiple samples, and accounting for latency between you and the VPN server and delay introduced by the VPN itself, they would be able to get decent confidence that it was you.
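As a toy illustration of that correlation step, here's a minimal sketch that checks how often a client->VPN flow record is followed, within an expected latency window, by a VPN->site record. All the timestamps and the latency window below are made up.

```python
# Toy sketch of the timing-correlation idea above: given flow timestamps on
# both sides of a VPN, check how often a client->VPN event is followed by a
# VPN->site event within an expected latency window. All values are made up.
from bisect import bisect_left

def match_fraction(client_times, vpn_to_site_times, min_lat=0.005, max_lat=0.2):
    """Fraction of client->VPN flows followed by a VPN->site flow in the window."""
    vpn_to_site_times = sorted(vpn_to_site_times)
    hits = 0
    for t in client_times:
        i = bisect_left(vpn_to_site_times, t + min_lat)
        if i < len(vpn_to_site_times) and vpn_to_site_times[i] <= t + max_lat:
            hits += 1
    return hits / len(client_times) if client_times else 0.0

# Hypothetical flow records (seconds): the suspect's bursts line up with the
# VPN exit's bursts towards example.com; an unrelated user's do not.
suspect   = [10.00, 10.40, 11.30, 12.10, 13.70]
unrelated = [10.20, 11.90, 12.70, 14.10, 15.50]
vpn_exit  = [10.03, 10.43, 11.33, 12.14, 13.73]

print(match_fraction(suspect, vpn_exit))    # high, e.g. 1.0
print(match_fraction(unrelated, vpn_exit))  # low
```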
Basically, when you get to the level of state threat actors, things get real spooky.
The censorship, the whatnot.
I feel sad that we have given governments such broad access in the name of unification.
We need more decentralization at the political and economic level as well (e.g. most of the money stays with your city, then your state, and only a nominal amount goes to the country).
Let the city decide what it wants through major town hall discussions.
The threat actor most commonly used to talk about this is a global passive adversary: a threat actor who can see all relevant traffic on the Internet but who can't decrypt or modify it.
This adversary would have the ability to ingest the massive amounts of data and metadata[0] it acquires from tier 1 ISPs all over the country[1] and the world[2]. They won't see raw HTTP traffic because nearly everything of interest is encrypted, but they can capture and store (time, srcip, srcport, dstip, dstport, bytes).
From there, it's a statistical attack: user A sent 700 kilobytes to a VPN service at time t; at t+epsilon the VPN connected to bad site B and sent 700 kilobytes plus a little overhead. Capture enough packet flows that span the user, the VPN, and the bad site and you can build statistical confidence that user A is interacting with bad site B, even in the presence of a VPN.
This can go in other directions too. If bad site B is a Tor hidden site whose admin gets captured by the FBI and turns over access, they can unmask in reverse – the site got packets from Tor relay A, which relay sent packets to A at time-epsilon, (...), all the way back to the source.
There's very little you can do to fight this kind of adversary. Adding hops and layers (VPN + VPN, Tor, Tor + VPN, etc.) can only make it harder. It's certainly an expensive attack in terms of time, storage, and the massive amounts of data required, but if your threat model includes a global passive adversary, it's game over.
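As a very rough sketch of that flow-matching idea, here's what pairing records by both time and size could look like. All the flow records, the delay window, and the overhead tolerance are invented for illustration.

```python
# Very rough sketch of the flow-matching attack described above: pair
# client->VPN flow records with VPN->site records that line up in both time
# and size (allowing for VPN latency and encapsulation overhead).
def likely_pairs(client_flows, exit_flows, max_delay=0.5, overhead=0.10):
    """client_flows / exit_flows: lists of (timestamp_seconds, bytes)."""
    pairs = []
    for ct, cbytes in client_flows:
        for et, ebytes in exit_flows:
            time_ok = 0.0 <= et - ct <= max_delay
            # Exit flow may be slightly larger or smaller after (de)encapsulation.
            size_ok = abs(ebytes - cbytes) <= overhead * cbytes
            if time_ok and size_ok:
                pairs.append(((ct, cbytes), (et, ebytes)))
    return pairs

client_flows = [(100.0, 700_000), (130.0, 52_000)]        # suspect -> VPN
exit_flows   = [(100.2, 705_000), (115.0, 2_000_000),     # VPN -> various sites
                (130.1, 53_500)]

for c, e in likely_pairs(client_flows, exit_flows):
    print(f"client flow {c} plausibly corresponds to exit flow {e}")
```

In practice the adversary has to deal with padding, multiplexed flows, and retransmissions, which is why it takes many samples across a session to build real confidence.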
I'm bearish on introducing noise[0] to resist traffic analysis, and I'm exceptionally bearish when the only layer managing noise injection is "a for-profit entity that can be legally compelled to do things"
But every layer helps; I'd feel more than happy torrenting over Mullvad alone, and I'd definitely use it as an additional layer of defense with other tools to keep me private if my threat model needed to consider stronger risks.
Synchronous packet transfer only solves the problem if you build a truly constant-rate network. Traffic monitoring works when variance exists; your flow has to be fully homogeneous to be provably secure against it. That means, in your model, your users would need to transmit and receive exactly 96 kbps at all times when on the network, and your nodes would talk to each other at 1024 kbps at all times when on the network. Otherwise, consider A->onion1->onion2->B – an attacker could potentially see the flow from onion1->onion2 drop to one packet per second when A isn't talking, and increase when A is.
Truly constant-rate anonymity networks dramatically increase resistance to passive traffic analysis, but they move users from a low-latency/high-throughput network to 56k dial-up speeds :) Not only does this suck (so most people won't use it), but the people who do choose to use it will glow neon bright to adversaries. The use of the system will be a strong indicator that, even if you don't know what the user is doing, the user is doing _something_ interesting.
And even if there were the desire, these networks are intrinsically limited in size and scale if they want to maintain a constant rate. Herbivore[0] is an interesting proposal in this space - use a DC-net partitioned into smaller cliques to give in-group anonymity but mass participation. And most use chaff packets – A has nothing to send, so it sends encrypted random data to maintain the constant-rate guarantee... I'm trying to find the paper I read that suggests a global passive adversary who goes "hands on" in the network could use a combination of watermarks generated through packet dropping/artificial queues plus knowledge of which packets are chaff to build a trace, but I'm struggling. If I find it I'll drop it here.
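To make the constant-rate/chaff idea concrete, here's a minimal sketch of a sender that emits exactly one fixed-size cell per tick and pads with random chaff when it has nothing real to send. The cell size, tick rate, and framing are illustrative, not any real protocol's parameters.

```python
# Minimal sketch of a constant-rate sender with chaff: exactly one fixed-size
# cell goes out per tick, so an observer sees an unvarying packet rate.
import os
import queue
import time

CELL_BYTES = 512      # every cell on the wire is exactly this size
TICK_SECONDS = 0.05   # one cell per tick -> constant ~82 kbps regardless of load

def constant_rate_sender(send, outgoing, ticks):
    """Drain `outgoing` (a queue.Queue of bytes) at a fixed rate, padding with chaff."""
    for _ in range(ticks):
        try:
            payload = outgoing.get_nowait()[:CELL_BYTES - 1]
            # Flag byte marks real data; in a real design this flag would sit
            # inside the encrypted payload so relays can't tell cells apart.
            cell = b"\x01" + payload.ljust(CELL_BYTES - 1, b"\x00")
        except queue.Empty:
            cell = b"\x00" + os.urandom(CELL_BYTES - 1)  # chaff: random filler
        send(cell)                # the link always carries CELL_BYTES per tick
        time.sleep(TICK_SECONDS)  # fixed pacing, independent of demand

# Example: whether or not the queue has data, exactly `ticks` cells go out.
q = queue.Queue()
q.put(b"hello")
constant_rate_sender(lambda cell: print(len(cell), "bytes"), q, ticks=4)
```

The obvious cost is visible in the constants: the rate has to be fixed in advance, so real throughput is capped there and bandwidth is burned on chaff whenever there's nothing to say.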
Could you protect against NetFlow analysis by pushing a bunch of noise over the VPN tunnel at all times? I'd assume it would at least make the analysis significantly more challenging.
Some of the prior work cited in this paper[0] addresses noise in anonymity networks, but in general: you either add noise at the link level, which malicious nodes can identify and ignore, or you inject fake chaff packets that are dropped somewhere inside the network, which can be statistically identified when you look at packet density across the network.
This might or might not extend to VPN nodes depending on your threat model - I'd personally assume every single node offered to me by a company in exchange for money is malicious if I was concerned about privacy.
I think we should have substantially more confidence in the claims of people who A) haven’t been caught misleading us yet, B) have published extensive code and weights for their absolutely cutting-edge stuff, and C) aren’t attached to a bunch of other bad behavior (e.g. DDoS crawlers) that we know about.
If there’s news of DeepSeek behaving badly and I missed it, then I take that back, but AFAIK they are at or near the top of the rankings on being good actors.
Why is this doubtful? Did you spot anything suspicious in their paper? They make the weights and a lot of training details open as well, which leaves much less room for making stuff up - e.g. you could check the training compute requirements from the active weight size (which they can't fake, since they released the weights) and the fp8 training they used.
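For example, here's the kind of back-of-envelope check that becomes possible once the active parameter count is pinned down by the released weights. It uses the commonly cited C ≈ 6·N·D approximation and the parameter/token figures from DeepSeek's own report; the sustained-throughput number is purely my assumption.

```python
# Back-of-envelope check of the kind described above, using the common
# C ≈ 6·N·D approximation for training FLOPs. The parameter/token counts are
# the publicly reported DeepSeek-V3 figures; the sustained-throughput number
# is a rough assumption, not something from their report.
active_params   = 37e9      # ~37B activated parameters per token (MoE)
training_tokens = 14.8e12   # ~14.8T tokens reported for pre-training

train_flops = 6 * active_params * training_tokens
print(f"estimated training compute: {train_flops:.2e} FLOPs")  # ~3.3e24

# Assume ~400 TFLOPS sustained per GPU in fp8 (a guess at realistic utilization).
assumed_sustained_flops = 400e12
gpu_hours = train_flops / assumed_sustained_flops / 3600
print(f"~{gpu_hours/1e6:.1f}M GPU-hours at that throughput")   # ~2.3M
```

Under those assumptions the estimate lands in the same ballpark as the roughly 2.8M H800 GPU-hours DeepSeek reported, which is at least internally consistent.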
There is a rumor that the open-source release differs from the hosted DeepSeek, so it needs more investigation. A bad actor would be someone piping OpenAI models behind a server.
It's not actually a dense 600B+ model; it's a mixture of experts. The individual experts are pretty small, and only a fraction of the parameters are active per token, so it doesn't require as much training compute to reach a decent point.
It's similar to how Mixtral got good performance without anywhere near OpenAI-class money/compute.
Yandex also has a pretty extensive cache, although recently they seem to have disabled caching for Reddit. Otherwise it is good for finding deleted stuff; I've seen cached pages go back as far as a couple of years for some smaller/deleted websites.
I only ever used the cache to find what Google thought was on the site (at the time of crawling), as these days it is common not to find that info in the updated page. For everything else, there is the Internet Archive.