Excessive bandwidth use #3429

Closed
wrouesnel opened this issue Nov 27, 2016 · 44 comments
Labels
need/analysis (Needs further analysis before proceeding), topic/perf (Performance)

Comments

@wrouesnel

Just had to kill the persistent ipfs-node I've been running on my home fileserver for the last 2 weeks due to excessive uploads (without any apparent fileserving activity taking place). The process was sitting at around 1 mbps of uploads constantly (judging from my router's bandwidth monitor), which on a home DSL connection is a huge chunk of upload capacity to be taking over.

This was running the docker image SHA: sha256:f9d41131894a178f2e57ca3db8ea6f098338f636cb0631231ffecf5fecdc2569

I do note that my logs have a fair few messages like the following, but they don't seem particularly informative:

Nov 26 05:16:04 monolith dockerd[4000]: 18:16:04.545 ERROR        dht: no addresses on peer being sent!
Nov 26 05:16:04 monolith dockerd[4000]:                                         [local:<peer.ID c7TGoJ>]
Nov 26 05:16:04 monolith dockerd[4000]:                                         [sending:<peer.ID ZRfiDX>]
Nov 26 05:16:04 monolith dockerd[4000]:                                         [remote:<peer.ID SoLV4B>] handlers.go:75
Nov 26 13:50:26 monolith dockerd[4000]: 02:50:26.172 ERROR        dht: no addresses on peer being sent!
Nov 26 13:50:26 monolith dockerd[4000]:                                         [local:<peer.ID c7TGoJ>]
Nov 26 13:50:26 monolith dockerd[4000]:                                         [sending:<peer.ID VADT5H>]
Nov 26 13:50:26 monolith dockerd[4000]:                                         [remote:<peer.ID SoLju6>] handlers.go:75
Nov 26 15:05:24 monolith dockerd[4000]: 04:05:24.041 ERROR        dht: no addresses on peer being sent!
Nov 26 15:05:24 monolith dockerd[4000]:                                         [local:<peer.ID c7TGoJ>]
Nov 26 15:05:24 monolith dockerd[4000]:                                         [sending:<peer.ID UQTebb>]
Nov 26 15:05:24 monolith dockerd[4000]:                                         [remote:<peer.ID SoLMeW>] handlers.go:75
Nov 26 15:05:24 monolith dockerd[4000]: 04:05:24.102 ERROR        dht: no addresses on peer being sent!
Nov 26 15:05:24 monolith dockerd[4000]:                                         [local:<peer.ID c7TGoJ>]
Nov 26 15:05:24 monolith dockerd[4000]:                                         [sending:<peer.ID UQTebb>]
Nov 26 15:05:24 monolith dockerd[4000]:                                         [remote:<peer.ID SoLer2>] handlers.go:75
Nov 26 15:46:06 monolith dockerd[4000]: 04:46:06.640 ERROR        dht: no addresses on peer being sent!
Nov 26 15:46:06 monolith dockerd[4000]:                                         [local:<peer.ID c7TGoJ>]
Nov 26 15:46:06 monolith dockerd[4000]:                                         [sending:<peer.ID Tq6zSp>]
Nov 26 15:46:06 monolith dockerd[4000]:                                         [remote:<peer.ID SoLueR>] handlers.go:75
Nov 26 15:46:07 monolith dockerd[4000]: 04:46:07.691 ERROR        dht: no addresses on peer being sent!
Nov 26 15:46:07 monolith dockerd[4000]:                                         [local:<peer.ID c7TGoJ>]
Nov 26 15:46:07 monolith dockerd[4000]:                                         [sending:<peer.ID Tq6zSp>]
Nov 26 15:46:07 monolith dockerd[4000]:                                         [remote:<peer.ID SoLueR>] handlers.go:75
Nov 26 15:46:09 monolith dockerd[4000]: 04:46:09.032 ERROR        dht: no addresses on peer being sent!
Nov 26 15:46:09 monolith dockerd[4000]:                                         [local:<peer.ID c7TGoJ>]
Nov 26 15:46:09 monolith dockerd[4000]:                                         [sending:<peer.ID Tq6zSp>]
Nov 26 15:46:09 monolith dockerd[4000]:                                         [remote:<peer.ID SoLju6>] handlers.go:75
Nov 26 22:16:31 monolith dockerd[4000]: 11:16:31.723 ERROR        dht: no addresses on peer being sent!
Nov 26 22:16:31 monolith dockerd[4000]:                                         [local:<peer.ID c7TGoJ>]
Nov 26 22:16:31 monolith dockerd[4000]:                                         [sending:<peer.ID VaByF3>]
Nov 26 22:16:31 monolith dockerd[4000]:                                         [remote:<peer.ID SoLMeW>] handlers.go:75
Nov 26 22:16:32 monolith dockerd[4000]: 11:16:32.774 ERROR        dht: no addresses on peer being sent!
Nov 26 22:16:32 monolith dockerd[4000]:                                         [local:<peer.ID c7TGoJ>]
Nov 26 22:16:32 monolith dockerd[4000]:                                         [sending:<peer.ID VaByF3>]
Nov 26 22:16:32 monolith dockerd[4000]:                                         [remote:<peer.ID SoLju6>] handlers.go:75
Nov 26 22:16:33 monolith dockerd[4000]: 11:16:33.611 ERROR        dht: no addresses on peer being sent!
Nov 26 22:16:33 monolith dockerd[4000]:                                         [local:<peer.ID c7TGoJ>]
Nov 26 22:16:33 monolith dockerd[4000]:                                         [sending:<peer.ID VaByF3>]
Nov 26 22:16:33 monolith dockerd[4000]:                                         [remote:<peer.ID SoLMeW>] handlers.go:75
Nov 27 00:41:10 monolith dockerd[4000]: 13:41:10.298 ERROR        dht: no addresses on peer being sent!
Nov 27 00:41:10 monolith dockerd[4000]:                                         [local:<peer.ID c7TGoJ>]
Nov 27 00:41:10 monolith dockerd[4000]:                                         [sending:<peer.ID PuZXvJ>]
Nov 27 00:41:10 monolith dockerd[4000]:                                         [remote:<peer.ID SoLju6>] handlers.go:75
Nov 27 18:42:37 monolith dockerd[4000]: 07:42:37.359 ERROR        dht: no addresses on peer being sent!
Nov 27 18:42:37 monolith dockerd[4000]:                                         [local:<peer.ID c7TGoJ>]
Nov 27 18:42:37 monolith dockerd[4000]:                                         [sending:<peer.ID acR5hF>]
Nov 27 18:42:37 monolith dockerd[4000]:                                         [remote:<peer.ID SoLSaf>] handlers.go:75
Nov 27 18:42:37 monolith dockerd[4000]: 07:42:37.511 ERROR        dht: no addresses on peer being sent!
Nov 27 18:42:37 monolith dockerd[4000]:                                         [local:<peer.ID c7TGoJ>]
Nov 27 18:42:37 monolith dockerd[4000]:                                         [sending:<peer.ID acR5hF>]
Nov 27 18:42:37 monolith dockerd[4000]:                                         [remote:<peer.ID SoLSaf>] handlers.go:75
@whyrusleeping
Member

@wrouesnel Try running your daemon in dht client mode with ipfs daemon --routing=dhtclient. That should help resolve at least some of the excess bandwidth usage

whyrusleeping added the need/analysis (Needs further analysis before proceeding) label Nov 28, 2016
@wrouesnel
Author

This didn't appreciably help. Launching with the option still pegged about 900 kbps of constant upstream usage, which is still far too much to keep a node consistently alive. That interferes heavily with the ability to decentralize to home users or mobile devices (i.e. to use distributed IPFS services on a limited connection).

@whyrusleeping
Member

@wrouesnel that's very odd... Could you check ipfs stats bw with the --proto flag set to each of /ipfs/bitswap/1.1.0 /ipfs/dht /ipfs/bitswap /ipfs/bitswap/1.0.0 /ipfs/kad/1.0.0 and figure out which protocol is taking up all that bandwidth?

@wrouesnel
Author

$ for p in /ipfs/bitswap/1.1.0 /ipfs/dht /ipfs/bitswap /ipfs/bitswap/1.0.0 /ipfs/kad/1.0.0 ; do echo ipfs stats bw --proto $p && ipfs stats bw --proto $p  && echo "---" ; done
ipfs stats bw --proto /ipfs/bitswap/1.1.0
Bandwidth
TotalIn: 0 B
TotalOut: 0 B
RateIn: 0 B/s
RateOut: 0 B/s
---
ipfs stats bw --proto /ipfs/dht
Bandwidth
TotalIn: 1.8 MB
TotalOut: 1.6 MB
RateIn: 115 B/s
RateOut: 767 B/s
---
ipfs stats bw --proto /ipfs/bitswap
Bandwidth
TotalIn: 5.7 MB
TotalOut: 0 B
RateIn: 146 B/s
RateOut: 0 B/s
---
ipfs stats bw --proto /ipfs/bitswap/1.0.0
Bandwidth
TotalIn: 0 B
TotalOut: 0 B
RateIn: 0 B/s
RateOut: 0 B/s
---
ipfs stats bw --proto /ipfs/kad/1.0.0
Bandwidth
TotalIn: 0 B
TotalOut: 0 B
RateIn: 0 B/s
RateOut: 0 B/s
---

It's actually looking a lot better now for some reason (with --routing=dhtclient) but I only just started the server process.

@wrouesnel
Author

So far so good, although downloading 2GB cumulative over the course of 6 hours for no actual upstream usage I would still argue isn't great behavior.

CONTAINER           CPU %               MEM USAGE / LIMIT      MEM %               NET I/O               BLOCK I/O           PIDS
ipfs-node           7.03%               314.1 MiB / 31.4 GiB   0.98%               1.438 GB / 703.9 MB   54.34 MB / 0 B      0

@whyrusleeping
Member

@wrouesnel hrm... yeah. That's definitely not ideal. I'll comment back later with some things to try to help diagnose the slowness.

@wrouesnel
Author

wrouesnel commented Jan 11, 2017

An update:

Looking at bandwidth graphs on my router, it's averaging around 500 kbps of traffic, spiking up to 1 Mbit though. This almost immediately flatlines after I kill the ipfs-node container, but this would seem to be way more usage than is being reported by the ipfs stats bw command.

So there's definitely way too much traffic going on, and it doesn't look like it's being accounted for properly - tying up 50% of my DSL upstream for an idle node permanently just isn't practical.

for p in /ipfs/bitswap/1.1.0 /ipfs/dht /ipfs/bitswap /ipfs/bitswap/1.0.0 /ipfs/kad/1.0.0 ; do echo ipfs stats bw --proto $p && ipfs stats bw --proto $p  && echo "---" ; done

ipfs stats bw --proto /ipfs/bitswap/1.1.0
Bandwidth
TotalIn: 0 B
TotalOut: 0 B
RateIn: 0 B/s
RateOut: 0 B/s
---
ipfs stats bw --proto /ipfs/dht
Bandwidth
TotalIn: 1.6 GB
TotalOut: 2.8 GB
RateIn: 2.9 kB/s
RateOut: 7.1 kB/s
---
ipfs stats bw --proto /ipfs/bitswap
Bandwidth
TotalIn: 7.3 GB
TotalOut: 47 MB
RateIn: 2.0 kB/s
RateOut: 0 B/s
---
ipfs stats bw --proto /ipfs/bitswap/1.0.0
Bandwidth
TotalIn: 0 B
TotalOut: 0 B
RateIn: 0 B/s
RateOut: 0 B/s
---
ipfs stats bw --proto /ipfs/kad/1.0.0
Bandwidth
TotalIn: 0 B
TotalOut: 0 B
RateIn: 0 B/s
RateOut: 0 B/s
---

This is after running for about a day or 2.
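
To put a number on the gap described above, here is a rough sketch that sums the per-protocol rates from the ipfs stats bw output in this comment and compares them with the router figure. The helper name compare_rates is hypothetical, and the comparison assumes the router graph and the ipfs rates cover comparable time windows, which the thread does not confirm (ipfs stats bw reports a momentary rate, the router an average).

```shell
# Hypothetical helper: compare the sum of reported per-protocol rates
# with a rate observed on the router.
#   $1: space-separated rates in kB/s as printed by `ipfs stats bw`
#   $2: router-observed rate in kbit/s
compare_rates() {
  echo "$1" | awk -v router="$2" '{
    total = 0
    for (i = 1; i <= NF; i++) total += $i   # sum the reported kB/s values
    kbps = total * 8                        # kB/s -> kbit/s
    printf "reported: %.1f kbit/s, router: %d kbit/s, ratio: %.1fx\n", kbps, router, router / kbps
  }'
}

# dht in/out and bitswap in/out from the output above, against ~500 kbps on the router
compare_rates "2.9 7.1 2.0 0" 500
```

With these inputs the reported rates add up to roughly a fifth of what the router shows, which is consistent with the suspicion that some traffic is not being accounted for.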

wrouesnel changed the title from "Excessive bandwidth use on DSL connections" to "Excessive bandwidth use" Jan 11, 2017
@slothbag

slothbag commented Feb 2, 2017

I have been struggling with excessive IPFS bandwidth usage for over a year now.. #2489

I believe this is due to IPFS having an unrestricted number of peers; often you will see many hundreds of concurrent connections, all consuming 5-10KB each. I have tried to manually limit the number of peer connections using iptables, but this effectively severs the node from the network, as the limit is quickly reached and then no one else can connect.

This combined with the fact that peers do not relay blocks means that any unique content on the restricted peer cannot be accessed.

I did have some luck throttling the ipfs daemon #1482, although it does make accessing the content extremely slow.

@whyrusleeping
Member

@slothbag You're right for the most part: having tons of peer connections is a cause of excessive bandwidth usage.
We've been fighting bandwidth usage as best as we can within the current structures, and it does help (0.4.5 rc1 uses much less bandwidth than 0.4.4, and 0.4.4 uses much less than 0.4.2), but the network has also been growing larger and larger at the same time, making our efforts not super visible.

One of the biggest issues right now is that it turns out DHTs don't scale super well. We're looking at and thinking deeply about solutions to this problem: ipfs/notes#162

The next part of the problem, as you suggest, is that we keep too many open connections. In an ideal world that wouldn't necessarily mean that bandwidth usage increases, but since bitswap and the dht both send lots of chatter to all the peers they are connected to (we can fix this, it just needs to be thought through), it results in a pretty direct correlation between number of peers and bandwidth usage. We've also been thinking about connection closing. It's also a hard problem (we have to keep track of which peers to keep bitswap sessions open to, and have to manage dht routing tables and peer search connections).

Until we get to working out these issues (it's very high on our priority list), updating to 0.4.5-rc1 should yield an improvement (the more people who update to 0.4.5, the more bandwidth savings everyone gets).

@slothbag again, thanks for sticking with us and continuing to pester us about this, it really helps.

@slothbag

slothbag commented Feb 3, 2017

Thanks @whyrusleeping , I wasn't aware of those notes. Interesting that you mention DHT as not being scalable as this affects just about all P2P applications that aspire to "take over the world" so to speak.

I'll add some of my thoughts to your notes.

@whyrusleeping
Member

@slothbag Thanks! The more thought we put into solving this the better :) It's a relatively unsolved problem in general. The bittorrent DHT gets away with it because they're nowhere near the scale that ipfs pushes as far as dht_records/node (you might have maybe 100 torrents per peer on mainline where in ipfs you'll have tens to hundreds of thousands to millions of blocks you need to announce).

@btc
Contributor

btc commented Apr 25, 2017

where in ipfs you'll have tens to hundreds of thousands to millions of blocks you need to announce

Is this the primary challenge?

(Writing from an uninformed perspective and curious about the scalability challenges the system faces.)

@whyrusleeping
Member

@btc Yeah, from the content routing (DHT) perspective, managing providers is the biggest issue in my opinion.

@ingokeck

Just ran into that problem myself. Got our Azure budget maxed out because of IPFS sending out around 500GB per month... Reported bandwidth usage by ipfs is way off: I measure 120kB/s in and 80kB/s out via iptraf, while ipfs reports 8kB/s in and 9kB/s out via ipfs stats bw.
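
As a sanity check on these figures, the sketch below converts a sustained rate into a monthly volume. The function name monthly_gb is hypothetical, and the calculation assumes decimal units (1 GB = 1,000,000 kB), a 30-day month, and that the measured rates hold around the clock.

```shell
# Hypothetical helper: GB transferred when a rate (kB/s) is sustained for N days.
#   $1: rate in kB/s
#   $2: number of days (defaults to 30)
monthly_gb() {
  awk -v rate="$1" -v days="${2:-30}" 'BEGIN { printf "%.1f\n", rate * 86400 * days / 1e6 }'
}

monthly_gb 200   # 120 kB/s in + 80 kB/s out, as measured with iptraf
monthly_gb 17    # 8 kB/s in + 9 kB/s out, as reported by ipfs stats bw
```

Under those assumptions the iptraf rates work out to roughly 518 GB per month in total, the same order of magnitude as the traffic described above, while the rates reported by ipfs stats bw would account for only about 44 GB.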

@voidzero

Any update on this? My primary reason for not using ipfs yet is that the bandwidth is just unrestricted. I could, but I don't want to do it from outside using traffic shaping or whatever. I want to be able to configure maximum bandwidth in ipfs itself. Last time I tried it, my bandwidth use was extremely high as well.

@ingokeck

@voidzero Any idea how to restrict it in an intelligent way from the outside? I do not want to restrict the actual download of blocks, only the background information exchange to sensible amounts.

@voidzero

No, wish I did @ingokeck, sorry.

@whyrusleeping
Member

@voidzero @ingokeck we're working on it, some recent things to try out (on latest master) are:

  • running the daemon as ipfs daemon --routing=dhtclient.
    • This makes your node not serve DHT requests to other peers
  • changing your reprovider strategy to "pinned" or "roots"
    • "pinned" causes your node to only reprovide objects that you've pinned, as opposed to the default of every local block. "roots" only provides the pins themselves, and not any of their child blocks. "roots" has a much lower bandwidth cost, but may harm the reachability of content stored by your node.
    • ipfs config Reprovider.Strategy pinned

In general, with each update we've had improvements that reduce bandwidth consumption. Coming soon we have a connection closing module that should help significantly.

@EternityForest

EternityForest commented Dec 24, 2017

All right, I'm not an IPFS dev, but I think the idea has a lot of potential, and I've done some research on various ways of improving performance (some of these are not my ideas and are just things I've found in various discussions), and here's what I think:

I'm a little confused about how bitswap works. Do you send wantlists to all peers? And why connect to hundreds of nodes in the first place? Why not only connect to nodes that have the files you want, or that want files from you?

Also, what about going one step further from the roots pinning, and letting provider records have a flag that says you also have all child blocks?

There are many files that are unlikely to be downloaded individually. Allowing people to provide entire folders with only one record would make a big difference in some cases.

Imagine a million people suddenly use an IPFS based file sync solution. Nobody except a few people care about each file, yet there's 20 records for every one scattered all about the swarm, but any node that's interested likely has a complete copy, so we can replace billions of records with possibly 10000x fewer root of folder only records.

Also, it would incentivise referring to files as hash/path/etc instead of linking directly to the hash. Using full paths preserves historical info about the origin of the file, while still allowing anyone who wants to to pin the file individually if they have a reason to.

You'd probably need to automatically pin the full chain of blocks when pinning something by its path in this mode for this to be most effective, but that's simple enough.

To allow linking to an existing block that everyone customarily references by full path without losing the path, symlinks could be added, so that a block could reference another block by path instead of directly.

Another idea is to stop providing everything you download 20 times. If you find that there are already 10 nodes pinning something, then maybe you only need to add 1 or 2 copies of the provider record. There's

And most importantly of all, I think transparent caching proxies could solve a lot of these issues. If all functions including adding provider records could go through a proxy, most users wouldn't have to worry about it, and most traffic could eventually be within data centers of colocated proxies, with old style DHT crawling as a fallback.

If you tell a massive data center to pin something, and someone else using the same proxy downloads it, that process can be as efficient as centralized stuff, because proxies can cache everything.

The company could also decide not to add provider records for all of the millions of files that clients use, and instead only have records saying "X big company can get this file", basically aggregating thousands of provider records into one, possibly passing a few through directly for reliability.

It would also allow a company to run a centralized server to make private data transfers faster, without having to rely on that server.

Also, it would allow for the same kind of functionality as the various caching acceleration systems that browsers use, in a standard way.

You could define multiple levels of what a proxy would do for whom, all the way up to actually storing pinned files. Now there's a standard protocol for pinning services, and any node can be a pinning service for any other node (handy if IPFS gets built into a router and you're on mobile).

Proxies could cache the actual data, meaning in theory there should be no performance hit vs using centralized services, because it essentially is centralized, right up until the server goes down.

Maybe IPNS names could associate a default proxy with a name, so as to say "This entire IPNS site uses this proxy as a tracker, unless it goes down then use the DHT". The tracker would still get some assistance from the swarm for large files, but you wouldn't need to do DHT lookups at all so long as the tracker was up and not overloaded.

Heavy reliance on proxies adds a bit of centralization, but it's seamless. If a proxy goes down it could take some cached stuff and provider records with it, but they'd be back soon enough as nodes noticed that the proxy was down.

And the potential gain in some cases could be hundreds of times fewer provider records (Via the aggregation mechanism) and near-centralized latency for some popular content.

@whyrusleeping
Member

Hey @EternityForest, good notes. 'recursive' or 'aggregate' provider records is something we've been thinking about, as well as some form of delegation in the routing system (proxies, as you call it). Discussion on that is over in this thread: ipfs/notes#162

As for bitswap, the 'dumb' default mode is to broadcast your wantlist to your connected peers optimistically; waiting until you find provider records for an object would add annoying amounts of latency to the requests. We have a newer API for bitswap called 'bitswap sessions' that only does that broadcast for the first node in a request, and then reuses peers it has gotten data from for future requests within the context of that request. You can read more about that here: #3786

Another optimization that will help bitswap significantly is 'ipld selectors', and we have an issue discussing that here: ipfs/notes#272

As for proxies, that's a very interesting idea that has lots of different approaches. For that you really need to consider different trust scenarios, and maybe even have some sort of reputation system, otherwise you might end up getting fed bad data or censored.

My apologies for not responding to every point, trying to take some time away from the computer, but I saw this comment come in and felt compelled to respond :)

@EternityForest

Thanks for getting back to me! It's cool to see how active this community is. I'll probably take some time to look at those threads after the holidays.

I'm heading out for a bit in a minute or two, but one idea for proxy trust is just to manually configure it, and use a "web of trust" type model.

Maybe you manually set the trust level of google.com to 1, and they "suggest" two other servers, which they claim are as trustworthy as them. So you trust them half as much, because they're one hop away. Maybe you also trust example.com, and they recommend those same servers, so you trust them a little more now that you have 2 good recommendations.
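
The arithmetic in that example can be sketched as follows. The halving per hop comes from the paragraph above; the rule for combining two recommendations (treating them as independent, so trust = 1 - product of the individual distrusts) is purely an assumption made for illustration, as is the function name combined_trust.

```shell
# Hypothetical rule: a server recommended by a trusted source gets half of
# that source's trust (one hop away); several recommendations are combined
# as 1 - product(1 - t_i). The combination rule is an assumption.
#   args: trust level (0..1) of each directly trusted recommender
combined_trust() {
  echo "$@" | awk '{
    distrust = 1
    for (i = 1; i <= NF; i++) distrust *= 1 - ($i / 2)   # halve for the extra hop
    printf "%.2f\n", 1 - distrust
  }'
}

combined_trust 1      # recommended by google.com only
combined_trust 1 1    # recommended by google.com and example.com
```

Under this rule a single recommendation yields 0.50 and two yield 0.75, matching the intuition of trusting a server "a little more" after a second good recommendation.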

@EternityForest

EternityForest commented Jan 8, 2018

More random ideas!

What if we had "implicit" provider records created when you make a DHT request? So that crawling the DHT leaves a "trail" of logged requests for others to find you.

If someone else wants that file, the node says "This guy probably found it by now", but you never had to create any explicit records.

"Real" provider records could be reserved for pinned data, and instead of optimistically asking every peer to reduce latency, we could simply start crawling the DHT, and if the content is popular, we'll find it pretty fast, if the content isn't popular, we might not have found it anyway without a DHT crawl.

@voidzero

voidzero commented Jan 8, 2018

I don't understand the ins and outs and while I appreciate the enthusiasm, here's the thing: I just want to set a global maximum amount of bandwidth. This is possible with so many clients, from bittorrent clients, to Tor, and to browser extensions. Why can't I just ensure that ipfs is allowed to use a maximum of (for example) 150KB/s in/egress? It doesn't have to be this difficult, does it?

@leerspace
Contributor

@voidzero There's an open issue for limiting bandwidth here: /issues/3065.

@voidzero

voidzero commented Jan 8, 2018

Ah, perfect. Thanks @leerspace; much obliged.

@nicolas-f

@voidzero You can use trickle to limit bandwidth.

I'm using this command to limit upload to 50 KB/s (trickle takes its rates in KB/s) in my crontab:

@reboot /usr/bin/trickle -s -u 50 -d 1000 /usr/local/bin/ipfs daemon --routing=dhtclient 1> /home/ipfsnode/ipfsnode.log 2> /home/ipfsnode/ipfsnode.err.log

@vext01

vext01 commented Sep 22, 2018

I'm not sure if it's bandwidth or the sheer number of connections to peers, but running IPFS on my home network renders my internet connection unusable.

I've not tried trickle yet, but I'd prefer a way to say "please only connect to 50 peers". The watermark settings don't seem to allow this...

@EternityForest

EternityForest commented Sep 22, 2018 via email

@Stebalien
Member

I'm not sure if it's bandwidth or the sheer number of connections to peers, but running IPFS on my home network renders my internet connection unusable.

...

What about adding a third state where a peer is registered but not connected, where it's only pinged at exponentially increasing intervals up to a day or so, to ensure that it's still there.

See discussion here: #3320. It's the number of connections.

Our current solution to this is QUIC (which go-ipfs now has experimental support for). It's a UDP based protocol so, at the protocol level, it has no connections. The hope is that this will convince routers to STOP TRYING TO DO SMART THINGS AND JUST ROUTE THE PACKETS DAMMIT!

@whyrusleeping
Member

@vext01

but I'd prefer a way to say "please only connect to 50 peers". The watermark settings don't seem to allow this...

That's the whole point of the watermark settings. If you want a hard cap at 50, set the highWater value to 50.

@Stebalien
Member

@whyrusleeping that doesn't quite work as it doesn't prevent new connections. I think he just wants max 50 connections open at any given time.

magik6k added the topic/perf (Performance) label Nov 4, 2018
@vext01

vext01 commented Dec 12, 2018

I think he just wants max 50 connections open at any given time.

Correct. The watermark doesn't seem to prevent new connections, so you still end up with hundreds of sockets open.
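
For readers who want to try the watermark settings discussed above, a minimal sketch follows. It assumes they live under Swarm.ConnMgr in the go-ipfs config (the key names LowWater and HighWater come from the go-ipfs configuration docs rather than from this thread), and the values 30 and 50 are only illustrative.

```shell
# Ask the connection manager to trim back toward 30 peers once 50 are open.
# These are trim targets, not a hard cap: as noted in this thread, new
# connections are still accepted, so the socket count can exceed HighWater.
ipfs config --json Swarm.ConnMgr.LowWater 30
ipfs config --json Swarm.ConnMgr.HighWater 50
```

The daemon most likely needs a restart to pick up the change.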

I'm still unable to use IPFS on my home network :(

@ItalyPaleAle

I actually have the same issue. This is my IPFS node over the last 30 days:

[Image: IPFS node bandwidth]

It's quite insane, considering that the nodes are serving just a bunch of static HTML files (in total, the shared data is less than 5 MB) and that there are only 5 people accessing that data, each person around once a day, through Cloudflare (which caches the data too).

@ItalyPaleAle

Update: I ran the same commands as @wrouesnel and here's the result for me. My nodes are still using 400-500 GB per month, both in ingress and egress (ingress is usually higher).

/ # for p in /ipfs/bitswap/1.1.0 /ipfs/dht /ipfs/bitswap /ipfs/bitswap/1.0.0 /ipfs/kad/1.0.0 ; do echo ipfs stats bw --proto $p && ipfs stats bw --proto $p && echo "---" ; done
ipfs stats bw --proto /ipfs/bitswap/1.1.0
Bandwidth
TotalIn: 632 MB
TotalOut: 5.6 MB
RateIn: 9.6 kB/s
RateOut: 13 B/s
---
ipfs stats bw --proto /ipfs/dht
Bandwidth
TotalIn: 937 kB
TotalOut: 7.8 MB
RateIn: 0 B/s
RateOut: 0 B/s
---
ipfs stats bw --proto /ipfs/bitswap
Bandwidth
TotalIn: 97 MB
TotalOut: 2.5 kB
RateIn: 0 B/s
RateOut: 0 B/s
---
ipfs stats bw --proto /ipfs/bitswap/1.0.0
Bandwidth
TotalIn: 0 B
TotalOut: 0 B
RateIn: 0 B/s
RateOut: 0 B/s
---
ipfs stats bw --proto /ipfs/kad/1.0.0
Bandwidth
TotalIn: 1.1 GB
TotalOut: 1.5 GB
RateIn: 12 kB/s
RateOut: 8.3 kB/s

Routing is set to "dht" and not "dhtclient", but I am still going to change it and see if it makes any difference.

Any idea what might be causing all that traffic? The node isn't hosting a lot of data and traffic to documents that are pinned by the node should be very low...

@whyrusleeping
Member

@ItalyPaleAle looks like dht traffic and bitswap wantlist broadcasts. These are both greatly improved in 0.4.19 (not yet released, but latest master has all the changes), so I would recommend updating. The more everyone else upgrades, the better it will get.

@ItalyPaleAle

@whyrusleeping Glad to hear about 0.4.19. This is a "production" node so I'd rather not run something from master, so I'll wait for the update (I'm using Docker btw)

Just to confirm I understood correctly:

  1. DHT traffic could be reduced by switching routing to "dhtclient". I've switched it on one of my three nodes (which are in a cluster, so they serve the same data) and will check in a day or two whether it made any difference.
  2. KAD traffic is the actual data being shared?
  3. Not sure what bitswap is for, but glad to hear you're improving that.

@whyrusleeping
Member

@ItalyPaleAle yeah, dht traffic can be reduced by setting your node to be just a dht client. Much of the traffic i'm seeing in your log is DHT traffic.

Bitswap traffic is mostly other peers telling you what they want, as the current mechanism for finding data is a broadcast to all connected peers of what you want. That's greatly improved in 0.4.19.

@ItalyPaleAle

@whyrusleeping the biggest traffic (1.1GB in and 1.5GB out) is actually from Kad. That's over half of the total traffic. I guess those are actual files I'm serving?

@whyrusleeping
Member

@ItalyPaleAle no, actual files are served over bitswap. All kad traffic is just announcements and searching.
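
To see at a glance which protocol dominates, the per-protocol totals pasted earlier in this thread can be ranked with a small awk filter. This is only a sketch: the function name rank_protocols is hypothetical, it expects the output format of the for loop used in this thread (an "ipfs stats bw --proto ..." line followed by TotalIn/TotalOut lines), and it assumes the units are decimal (kB = 1e3 B, MB = 1e6 B, GB = 1e9 B).

```shell
# Hypothetical helper: read the loop output on stdin and print
# TotalIn + TotalOut per protocol in MB, largest first.
rank_protocols() {
  awk '
    # Convert a value with a unit suffix to MB (decimal units assumed).
    function to_mb(value, unit) {
      if (unit == "B")  return value / 1e6
      if (unit == "kB") return value / 1e3
      if (unit == "MB") return value
      if (unit == "GB") return value * 1e3
      return 0
    }
    /^ipfs stats bw --proto/ { proto = $NF }                      # remember current protocol
    /^Total(In|Out):/        { total[proto] += to_mb($2, $3) }    # accumulate in + out
    END { for (p in total) printf "%.1f MB\t%s\n", total[p], p }
  ' | sort -rn
}

# Sample input: two of the protocols from the output earlier in the thread.
rank_protocols <<'EOF'
ipfs stats bw --proto /ipfs/bitswap/1.1.0
TotalIn: 632 MB
TotalOut: 5.6 MB
ipfs stats bw --proto /ipfs/kad/1.0.0
TotalIn: 1.1 GB
TotalOut: 1.5 GB
EOF
```

On a live node the same filter could be fed by piping the for loop from this thread into rank_protocols instead of using the here-document.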

@Clay-Ferguson

Clay-Ferguson commented Apr 14, 2019

Seems to me like definitely "--routing=dhtclient" should be the default setting. When developers first get started using this technology they're not going to know every 'gotcha', and we don't want them getting slammed by unexpected massive bandwidth consumption. That will create a very negative (and potentially costly) experience for nearly 100% of early adopters.

For those who really do want to participate as a DHT server, they can be expected to figure out how to turn that setting on.

@EternityForest

Hey guys! I've been playing around with some mesh projects and was reminded of this ongoing project and had a random question.

Currently, as I understand it, IPFS stores DHT records on the N nodes that are closest to the hash, selected from the entire set of all nodes.

Does any concept of a "sub-DHT" exist yet? It seems that if there is some shared set of peers that most of your peers have, there's no real need to flood wantlists to everyone; you can just have all nodes store a record on the closest node within that "sub-DHT", because it's only a 1-hop lookup for any node in the group that uses it.

Treating nodes with nearby IDs as a sub-DHT, and everyone on your local network as another, plus all recently connected peers as a third, might work just as well as flooding.

Sending fewer wantlists to other nodes would increase privacy, and using sub-DHTs wouldn't require any extra trusted nodes. You'd have some issues like needing a way to keep one node with tons of records from majorly flooding its peer group, but eliminating wantlist floods would do a lot.

You'd also have issues with different nodes having different sized sets of peers and not overlapping perfectly, but the overall performance gain might still be better, and you could always have a flood mode for people who really cared about latency.

It would also open the possibility for things like geographical node IDs. Convert your node ID to GPS coordinates, then keep generating till you get one that's within a few miles.

Since you mostly share the same sub-DHT as your true geographical peers, fetching locally generated content might get a bit of a boost. Or use "vanity" addresses, and you might have a decent chance of being in a sub-DHT with someone who wants the blocks you have.

Perhaps you could even reduce the replication factor dynamically, for things like Linux disk images where there's already tons of peers everywhere, and you really just want the nearby ones if possible.

@Stebalien
Member

@EternityForest

Does any concept of a "sub-DHT" exist yet?

0.5.0 will have a lan-only DHT but records won't be advertised to it if we're also connected to the WAN DHT. We were concerned that flooding the LAN DHT with records would be a problem for asymmetric networks (e.g., a network with a heavily loaded server storing a lot of content and, say, a laptop, smartphone, etc.).

Other than that, you may be interested in some of the discussion around ipfs/notes#291.

flooding

What do you mean by "flooding"?

Bitswap

Before trying the DHT, we ask all of our already connected nodes. Is this what you meant by "flood"? We do this because querying connected peers tends to be faster (they also tend to be the peers that have useful content).

However, we could and should get smarter about this. We should ideally have some form of staggered flood where we send messages to the peers most likely to have the content first.

Also note, when this issue was created, bitswap would send every wantlist item to every peer always. This has since changed (#3786). Now, once we've started receiving blocks, we only ask peers in the session for more blocks.

DHT

When trying to find content, we traverse the DHT in an ordered search. We don't (intentionally) flood the DHT. (note: I say intentionally because prior to the release coming out tomorrow (0.5.0), go-ipfs would connect to a large portion of the DHT for every DHT request due to some bugs).


OT: This discussion is probably best had on the forums (https://discuss.ipfs.io).

I'm going to close this issue as it spans a very large period of time. IPFS has changed quite a bit over that period of time.

@ctrlcctrlv

ctrlcctrlv commented Dec 13, 2021

I couldn't find any simple advice to just start go-ipfs with trickle in systemd, so I added some to Arch Wiki. It's generally useful for Linux users though, as most Linuxes use Systemd.

Starting the service with a different command line

You may also want to limit the bandwidth IPFS uses by using trickle. (Cf. #3429.) You can write this Systemd/User service file to $HOME/.config/systemd/user/go-ipfs.service:

[Unit]
Description=InterPlanetary File System (IPFS) daemon (rate-limited via Trickle)
After=network-online.target
Wants=network-online.target

[Service]
ExecStart=/usr/bin/trickle -s -u 56 /usr/bin/ipfs daemon --routing=dhtclient
Restart=on-failure

[Install]
WantedBy=default.target

This will both start IPFS with trickle and pass the argument --routing=dhtclient. You may of course modify it as needed, or base your version on the package's /usr/lib/systemd/user/ipfs.service.

systemctl --user enable --now go-ipfs

@Clay-Ferguson

In case this helps any go-ipfs (via Docker) users:

My solution was to use the deploy.resources.limits feature of Docker Compose (an ugly solution, but works), as shown here:

https://github.com/Clay-Ferguson/quantizr/blob/master/docker-compose-dev.yaml

But actually it might be better to simply put the trickle command into my actual docker file, which in my case is here (which I haven't tried yet):
https://github.com/Clay-Ferguson/quantizr/blob/master/dockerfile-ipfs
