How to make a partition structure

advertisements

Let's start with the segment [0, 10] stored in a list

[ [0, 10] ]

I received a set of ranges

[1,6]
[5, 8]

to partition the segment into the list

[ [0,1], [1,5], [5,6], [6, 8], [8, 10] ]

What will be a good data-structure/way to do that in python?

I don't know the terminology for this kind of task, so my google search are fruitless.

I could always brute force with numpy.searchsorted, but this will not be clean.
Especially that each sub-segment is in fact an object with a lot of properties.
And, I have several round of

creating sub-segment object / receiving ranges for further partitioning


I'm not sure how you'd like to query your data structure or what "properties" each segment has but given your example, a sorted set data structure would suffice. If we flatten your lists then we have:

initial = [0, 10]
...
final = [0, 1, 5, 6, 8, 10]

And we can transform final into your segments with:

segments = [final[pos:pos+1] for pos in xrange(len(final) - 1)]

So with each additional segment we combine that with something like:

next_iter = sorted(set(prev_iter + segment))

This would become expensive for large lists but there are data types that can help. A sorted set container maintains its elements as a set in sorted order. The sortedcontainers module provides a SortedSet data type for exactly this purpose:

from sortedcontainers import SortedSet

segments = SortedSet([0, 10])

def add_segment(start, end):
    segments.add(start)
    segments.add(end)

add_segment(1, 6)
add_segment(5, 8)

print segments
# SortedSet([0, 1, 5, 6, 8, 10])

A SortedSet supports fast indexing and bisecting so you can query like so:

print segments[2]
# 5

pos = segments.bisect(7)
print [segments[pos - 1], segments[pos]]
# [6, 8]