1 files changed, 67 insertions, 51 deletions
diff --git a/src/Network/BitTorrent/Internal.lhs b/src/Network/BitTorrent/Internal.lhs
index 078920a4..8936f507 100644
--- a/src/Network/BitTorrent/Internal.lhs
+++ b/src/Network/BitTorrent/Internal.lhs
@@ -114,9 +114,8 @@
 > import Network.BitTorrent.Exchange.Protocol as BT
 > import Network.BitTorrent.Tracker.Protocol as BT
-> {-----------------------------------------------------------------------
->     Progress
-> -----------------------------------------------------------------------}
+    Progress
+------------------------------------------------------------------------
 > -- | 'Progress' contains upload/download/left stats about
 > --   current client state and used to notify the tracker.
@@ -131,17 +130,18 @@
 >   , _downloaded :: !Integer -- ^ Total amount of bytes downloaded.
 >   , _left       :: !Integer -- ^ Total amount of bytes left.
 >   } deriving (Show, Read, Eq)
+>
+> $(makeLenses ''Progress)
-> -- TODO use atomic bits and Word64
+TODO use Word64?
+TODO use atomic bits?
-> $(makeLenses ''Progress)
+Please note that the tracker might penalize the client in some way if it
+does not accumulate progress. If possible, save 'Progress' between
+client sessions to avoid that.
 > -- | Initial progress is used when there are no session before.
-> --
-> --   Please note that tracker might penalize client some way if the do
-> --   not accumulate progress. If possible and save 'Progress' between
-> --   client sessions to avoid that.
-> --
 > startProgress :: Integer -> Progress
 > startProgress = Progress 0 0
@@ -169,7 +169,7 @@
 > {-# INLINE dequeuedProgress #-}
-Thread layout
+    Thread layout
 ------------------------------------------------------------------------
 When client session created 2 new threads appear:
@@ -196,7 +196,7 @@ So for e.g., in order to obtain our first block we need to run at
 least 7 threads: main thread, 2 client session threads, 3 swarm session
 threads and PeerSession thread.
-Thread throttling
+    Thread throttling
 ------------------------------------------------------------------------
 If we will not restrict number of threads we could end up
@@ -207,6 +207,9 @@ strategy because each swarm might have say 1 thread and we could end
 up bounded by the meaningless limit. Bounding global number of p2p
 sessions should work better, and simpler.
+**TODO:** priority-based throttling: leecher threads should have higher
+priority than seeder threads.
 > -- | Each client might have a limited number of threads.
 > type ThreadCount = Int
@@ -214,9 +217,24 @@ sessions should work better, and simpler.
 > defaultThreadCount :: ThreadCount
 > defaultThreadCount = 1000
-Torrent Map
+    Torrent Map
 ------------------------------------------------------------------------
+Keeping all seeding torrent metafiles in memory is a _bad_ idea: for
+1TB of data we need on the order of 100MB of metadata (using a 256KB
+piece size). This solution does not scale further.
+To avoid this we keep just *metainfo* about *metainfo*:
+> -- | Local info about torrent location.
+> data TorrentLoc = TorrentLoc {
+>     -- | Full path to .torrent metafile.
+>     metafilePath :: FilePath
+>     -- | Full path to directory containing content files associated
+>     -- with the metafile.
+>   , dataPath     :: FilePath
+>   }
 TorrentMap is used to keep track all known torrents for the
 client. When some peer trying to connect to us it's necessary to
 dispatch appropriate 'SwarmSession' (or start new one if there are
@@ -225,53 +243,51 @@ but nothing more. So to accept new 'PeerSession' we need to lookup
 torrent metainfo and content files (if there are some) by the
 'InfoHash' and only after that enter exchange loop.
-*PERFORMANCE NOTE:* keeping torrent metafiles in memory is a _bad_
-idea: for 1TB of data we need at least 100MB of metadata. (using 256KB
-piece size). This solution do not scale further. Solution with
-TorrentLoc is much better and takes much more less space, moreover it
-depends on count of torrents but not on count of data itself. To scale
-further, in future we might add something like database (for
-e.g. sqlite) for this kind of things.
+The solution with TorrentLoc is much better and takes much less
+space; moreover, it depends on the number of torrents, not on the amount
+of data itself. To scale further, in the future we might add something
+like a database (e.g. sqlite) for this kind of thing.
-> -- | Local identification info location about
-> data TorrentLoc = TorrentLoc {
->     -- |
->     metafilePath :: FilePath
->   , dataPath     :: FilePath
->   }
+> -- | Used to find torrent info and data in order to accept a connection.
+> type TorrentMap = HashMap InfoHash TorrentLoc
+While *registering* a torrent we need to check that the torrent metafile is
+correct, that all the files are present in the filesystem, and so
+forth. However, content validation using hashes will take a long time,
+so we need to do this on demand: if a peer asks for a block, we
+validate the corresponding piece and only then read and send the block
+back.
+> -- | Used to check the torrent location before registering the torrent.
 > validateTorrent :: TorrentLoc -> IO ()
 > validateTorrent = error "validateTorrent: not implemented"
+    Client session
+------------------------------------------------------------------------
-> --
-> type TorrentMap = HashMap InfoHash TorrentLoc
+Basically, a client session should contain the options which the user
+application stores in configuration files and which are related to the
+protocol. Moreover, it should contain all client identification
+info, e.g. for DHT.
-Client session
------------------------------------------------------------------------
+A client session is the basic unit of the BitTorrent network. It has:
-Basically, client session should contain options which user app store
-in configuration files. (related to the protocol) Moreover it should
-contain the all client identification info. (e.g. DHT)
+  * The /peer ID/ used as the unique identifier of the client in the
+network. Obviously, this value does not change during a client session.
-> -- | Client session is the basic unit of bittorrent network, it has:
-> --
-> --     * The /peer ID/ used as unique identifier of the client in
-> --     network. Obviously, this value is not changed during client
-> --     session.
-> --
-> --     * The number of /protocol extensions/ it might use. This value
-> --     is static as well, but if you want to dynamically reconfigure
-> --     the client you might kill the end the current session and
-> --     create a new with the fresh required extensions.
-> --
-> --     * The number of /swarms/ to join, each swarm described by the
-> --     'SwarmSession'.
-> --
-> --  Normally, you would have one client session, however, if we need,
-> --  in one application we could have many clients with different peer
-> --  ID's and different enabled extensions at the same time.
-> --
+  * A number of /protocol extensions/ it might use. This value is
+static as well, but if you want to dynamically reconfigure the client
+you might end the current session and create a new one with the
+newly required extensions.
+  * A number of /swarms/ to join, each swarm described by a
+'SwarmSession'.
+Normally, you would have one client session; however, if needed,
+one application could have many clients with different peer IDs
+and different enabled extensions at the same time.
+> -- |
 > data ClientSession = ClientSession {
 >     -- | Used in handshakes and discovery mechanism.
 >     clientPeerId      :: !PeerId
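
Review note: the Torrent Map hunk describes a two-stage policy (cheap checks at registration time, piece hashing deferred until a peer asks for a block), while `validateTorrent` itself is still a stub. The following is only a rough sketch of what the cheap stage and the lookup could look like, not the author's implementation. Several things here are assumptions made for illustration: `InfoHash` is a hypothetical `String` wrapper standing in for the library's type, `Data.Map.Strict` replaces the `HashMap` from the diff so that only packages bundled with GHC are needed, and the helper names `checkLoc`, `registerTorrent` and `lookupTorrent` do not appear in the source.

```haskell
import qualified Data.Map.Strict as M
import System.Directory (doesDirectoryExist, doesFileExist)

-- | Hypothetical stand-in for the library's 'InfoHash'; a plain String
--   wrapper keeps the sketch free of extra dependencies.
newtype InfoHash = InfoHash String
  deriving (Show, Eq, Ord)

-- | Same shape as the 'TorrentLoc' record added in the diff.
data TorrentLoc = TorrentLoc
  { metafilePath :: FilePath  -- full path to the .torrent metafile
  , dataPath     :: FilePath  -- directory holding the content files
  } deriving (Show, Eq)

-- | Assumption: Data.Map is swapped in for the HashMap used in the source.
type TorrentMap = M.Map InfoHash TorrentLoc

-- | Cheap structural check only: both paths must exist on disk.
--   Piece hashes are deliberately not verified at this stage.
checkLoc :: TorrentLoc -> IO Bool
checkLoc loc = do
  metaOk <- doesFileExist (metafilePath loc)
  dataOk <- doesDirectoryExist (dataPath loc)
  return (metaOk && dataOk)

-- | Add a torrent to the map if its location passes the cheap check;
--   otherwise report why registration was refused.
registerTorrent
  :: InfoHash -> TorrentLoc -> TorrentMap -> IO (Either String TorrentMap)
registerTorrent ih loc tmap = do
  ok <- checkLoc loc
  return $
    if ok
      then Right (M.insert ih loc tmap)
      else Left ("registerTorrent: missing metafile or data directory for "
                 ++ show ih)

-- | Lookup performed when an incoming handshake names an info hash.
lookupTorrent :: InfoHash -> TorrentMap -> Maybe TorrentLoc
lookupTorrent = M.lookup
```

Returning `Either String TorrentMap` rather than throwing keeps the failure visible to the caller, which may matter once registration is driven by user input. Loaded in GHCi, `registerTorrent (InfoHash "abc") (TorrentLoc "a.torrent" ".") M.empty` yields a `Right` only when `a.torrent` exists in the current directory.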
