构建字符串图（Levenshtein距离）

当前，在我的计算机科学课程中，我们正在讨论图以及如何使用图找出最短距离。大约一周前，我收到了一份作业，老师给我们提供了使用整数的图形代码，我们必须调整它才能使用单词列表计算Levenshtein距离。我遇到的问题是，我不明白图表如何处理足够的操作。我已经尝试过使用谷歌搜索图表，但没有发现类似于我给出的程序类型。构建字符串图（Levenshtein距离）

我们刚刚在链表上完成了一个单元，并且我认为图的操作方式相似吗？我知道每个节点都会指向许多其他节点，但是在我有2000个字都指向对方的情况下，如何在没有在结构中声明多个节点的情况下跟踪每个节点2000个指针？我相信（不是100％），在我被授予的课程中，我的老师使用了一个整数向量向量来跟踪，但我不知道如何实现。

我不是要求任何人充分评论每一行，因为这是一项巨大的工作，但如果有人可以粗略地解释我将如何完成我上面提到的问题，也许阅读代码并给我一个粗略的理解部分章节意味着什么（我将对某些章节发表评论，我特别不理解）我将非常感激。

这是我们得到的代码：

#include <iostream> 
#include <vector> 
#include <algorithm> //for max<> 
#include <limits> 

using namespace std; 

typedef vector <int> ivec; 
typedef vector <ivec> imatrix; //A vector of vectors, not how this works or how to implement 
typedef vector <bool> bvec; 

struct graph 
{ 
    imatrix edges; //list of attached vertices for each node 
    int numVertices; 
}; 

//I understand the ostream overloading 
ostream & operator << (ostream & stream, ivec &vec) 
{ 
    for (int i = 0; i < vec.size(); i++) 
    { 
     stream << vec[i] << " "; 
    } 
    return stream; 
} 

ostream & operator << (ostream & stream, graph &g) 
{ 
    stream << endl << "numVert = " << g.numVertices << endl; 
    for (int i = 0; i < g.numVertices; i++) 
    { 
     stream << "vertex = " << i+1 << " | edges = " << g.edges[i] << endl; 
    } 
    return stream; 
} 

const int sentinel = -1; 

bvec inTree; 
ivec distanceNodes; 
ivec parents; 

void initGraph(graph * g); 
void insertEdge(graph * g, int nodeNum, int edgeNum); 
void initSearch(graph * g); 
void shortestPath(graph * g, int start, int end); 

int main() 
{ 
    //I understand the main, the two numbers in insertEdge are being hooked together and the two numbers in shortestPath are what we are looking to connect in the shortest way possible 
    graph g; 
    initGraph(&g); 
    insertEdge(&g, 1, 2); 
    insertEdge(&g, 1, 3); 
    insertEdge(&g, 2, 1); 
    insertEdge(&g, 2, 3); 
    insertEdge(&g, 2, 4); 
    insertEdge(&g, 3, 1); 
    insertEdge(&g, 3, 2); 
    insertEdge(&g, 3, 4); 
    insertEdge(&g, 4, 2); 
    insertEdge(&g, 4, 3); 
    insertEdge(&g, 4, 5); 
    insertEdge(&g, 5, 4); 
    insertEdge(&g, 6, 7); 
    insertEdge(&g, 7, 6); 
    cout << "The graph is " << g << endl; 
    shortestPath(&g, 1, 5); 
    shortestPath(&g, 2, 4); 
    shortestPath(&g, 5, 2); 
    shortestPath(&g, 1, 7); 
    return 0; 
} 

void initGraph(graph * g) 
{ 
    g -> numVertices = 0; //Why set the number of vertices to 0? 
} 

void insertEdge(graph * g, int nodeNum, int edgeNum) 
{ 
    int numVertices = max(nodeNum, edgeNum); //Max finds the larger of two numbers I believe? How can this be used with strings, one is not bigger than the other 
    numVertices = max(1, numVertices); 
    if (numVertices > g->numVertices) 
    { 
     for (int i = g->numVertices; i <= numVertices; i++) 
     { 
      ivec nodes; 
      if (g->edges.size() < i) 
      { 
       g -> edges.push_back(nodes); 
      } 
     } 
     g->numVertices = numVertices; 
    } 
    g->edges[nodeNum - 1].push_back(edgeNum); 
} 

void initSearch(graph * g) //I believe this function simply resets the values from a previous search 
{ 
    if (g == NULL) 
    { 
     return; 
    } 
    inTree.clear(); 
    distanceNodes.clear(); 
    parents.clear(); 
    for (int i = 0; i <= g->numVertices; i++) 
    { 
     inTree.push_back(false); 
     distanceNodes.push_back(numeric_limits <int> :: max()); 
     parents.push_back(sentinel); 
    } 
} 

void shortestPath(graph * g, int start, int end) 
{ 
    //Very confused about how this function works 
    initSearch(g); 
    int edge; 
    int curr; //current node 
    int dist; 
    distanceNodes[start] = 0; 
    curr = start; 
    while (! inTree[curr]) 
    { 
     inTree[curr] = true; 
     ivec edges = g->edges[curr - 1]; 
     for (int i = 0; i < edges.size(); i++) 
     { 
      edge = edges[i]; 

      if (distanceNodes[edge] > distanceNodes[curr] + 1) 
      { 
       distanceNodes[edge] = distanceNodes[curr] + 1; 
       parents[edge] = curr; 
      } 
     } 
     curr = 1; 
     dist = numeric_limits <int> :: max(); 
     for (int i = 1; i <= g->numVertices; i++) 
     { 
      if ((!inTree[i]) && (dist > distanceNodes[i])) 
      { 
       dist = distanceNodes[i]; 
       curr = i; 
      } 
     } 
    } 
    ivec path; 
    if (distanceNodes[end] == numeric_limits <int> :: max()) //is there a numeric_limits <string> :: max? 
    { 
     cout << "No way from " << start << " to " << end << endl; 
    } 
    else 
    { 
     int temp = end; 
     while (temp != start) 
     { 
      path.push_back(temp); 
      temp = parents[temp]; 
     } 
     path.push_back(start); 
     reverse(path.begin(), path.end()); 
     cout << "From " << start << " to " << end << " is " << path << endl; 
    } 
}

如果你能帮忙，那将是最欢迎的，因为我很可能会与图表的详细项目，我挣扎，由于不理解他们。

谢谢特里斯坦

来源

2014-05-14 Tristan

一些答案：

问：你问为什么numVertices在下面设置为0：在g的声明

void initGraph(graph * g) 
{ 
    g -> numVertices = 0; //Why set the number of vertices to 0? 
}

A.看 - 它默认初始化为：

int main() 
{ 
    graph g; 
    .... 
}

现在看defi图表的名称 - 它没有构造函数：

struct graph 
{ 
    imatrix edges; //list of attached vertices for each node 
    int numVertices; 
};

因此，默认情况下边缘会被正确初始化，因为矢量具有构造函数。但是numVertices是一个原始类型，所以它将包含任何随机值发生在该内存位置 - 这意味着它需要手动初始化。这就是为什么initGraph不需要初始化边缘，但它确实需要初始化numVertices。

问：你问如何可以找到大两个标准::串知道MAX（）返回两个整数中较大的：

int numVertices = max(nodeNum, edgeNum); //Max finds the larger of two numbers I believe? How can this be used with strings, one is not bigger than the other

A.根据http://www.cplusplus.com/reference/algorithm/max/ Max使用“功能用途运算符<（或comp，如果提供）来比较这些值。“但std :: strings可以使用<运算符进行比较，所以确实没有问题。

问：你问到向量的向量：

typedef vector <int> ivec; 
typedef vector <ivec> imatrix; //A vector of vectors, not how this works or how to implement

A.你可以用[]访问一个矢量，所以如果你有一个名为x的imatrix类型的变量，你可以说x [0]这将返回一个ivec（因为这是存储在一个imatrix矢量中的对象的类型，所以如果你说x [0] [0]，将返回存储在由x返回IVEC的第一个整数[0]要改变它使用一个字符串只是说：

typedef vector <std::string> ivec; 
typedef vector <ivec> imatrix;

你也可以重命名变量，如果你通缉。

您还需要#include <string>

来源

2014-05-14 02:18:35

谢谢，这是非常有帮助的！我有一个后续问题：你说max可以用来比较没有问题的字符串，你知道一个字符串比另一个字符串“大”的标准吗？它是基于长度还是字母值或别的东西？谢谢 – Tristan

typedef vector <ivec> imatrix; //A vector of vectors, not how this works or how to implement

这里图表被表示为Adjacency Matrix。您还可以使用Adjacency List来表示一个图形，其中每个节点将包含相邻节点的数组/链接列表。

g -> numVertices = 0; //Why set the number of vertices to 0?

它初始化图形，在启动时顶点/节点数量为零。当使用insertEdge方法插入边和节点时，该号码将被更新。

int numVertices = max(nodeNum, edgeNum); //Max finds the larger of two numbers I believe? How can this be used with strings, one is not bigger than the other

你虽然没有发布完整的代码，我认为最大的值用于插入边缘之前添加所需数量的顶点。

ivec nodes; 
if (g->edges.size() < i) 
{ 
    g -> edges.push_back(nodes); 
}

上面的代码插入新的顶点。您可能会在此处为您的版本执行integer comparison，而不是string，字符串是节点的数据，而不是节点的编号。如果你需要字符串比较，C++已经为此重载了运算符。

关于initSearch和最短路径方法，在这里后者发现使用的算法（我不知道是哪个，你可以搜索）节点之间的最短路径，并寻找最短路径，前者方法初始化之前将用于搜索的值。例如，它可以将每对节点之间的距离设置为infinity最初，当它们之间找到路径时，它将被更新。

来源

2014-05-14 02:32:08 Rakib

所以，就我的理解而言，邻接矩阵就像一个网格，其中每个节点在图中都有一排，并且每个节点可能具有的边的列都是连接，连接用1表示，非连接用0表示。它是否正确？ – Tristan

构建字符串图（Levenshtein距离）

回答

相关问题